arXiv: 1509.04541v1 [stat.ML] 15 Sep 2015


When are Kalman-Filter Restless Bandits Indexable? 

Christopher Dance and Tomi Silander 
Xerox Research Centre Europe, Grenoble, France 

May, 2015 


Abstract 

We study the restless bandit associated with an extremely simple scalar Kalman filter 
model in discrete time. Under certain assumptions, we prove that the problem is indexable 
in the sense that the Whittle index is a non-decreasing function of the relevant belief state. 
In spite of the long history of this problem, this appears to be the first such proof. We use 
results about Schur-convexity and mechanical words, which are particular binary strings 
intimately related to palindromes. 


1 Introduction 

We study the problem of monitoring several time series so as to maintain a precise belief 
while minimising the cost of sensing. Such problems can be viewed as POMDPs with belief- 
dependent rewards [3] and their applications include active sensing [7] , attention mechanisms for 
multiple-object tracking [22], as well as online summarisation of massive data from time-series
[1]. Specifically, we discuss the restless bandit [23] associated with the discrete-time Kalman
filter [T2].

Restless bandits generalise bandit problems to situations where the state of each arm
(project, site or target) continues to change even if the arm is not played. As with bandit 
problems, the states of the arms evolve independently given the actions taken, suggesting that 
there might be efficient algorithms for large-scale settings, based on calculating an index for 
each arm, which is a real number associated with the (belief-)state of that arm alone. However, 
while bandits always have an optimal index policy (select the arm with the largest index), it 
is known that no index policy can be optimal for some discrete-state restless bandits and
such problems are in general PSPACE-hard even to approximate to any non-trivial factor [10] . 
Further, in this paper we address restless bandits with real-valued rather than discrete states. 

On the other hand, Whittle proposed a natural index policy for restless bandits [23], but
this policy only makes sense when the restless bandit is indexable, as we now explain. Say 
we have n restless bandits and we are constrained to play m arms at each time. Whittle 
considered relaxing this constraint by only requiring that the time-average number of arms 
played is m. Now the optimal average cost for this relaxed problem is a lower bound on the 
optimal average cost for the original problem. Also, the relaxed problem can be separated into 
n single-arm problems by the method of Lagrange multipliers, making it relatively easy to solve. 
In this separated version of the relaxed problem, each arm behaves identically to an arm in 
the original problem, except that an additional price $\lambda$ is charged each time the arm is played,
where $\lambda$ corresponds to the Lagrange multiplier for the relaxed constraint. Now let us consider
a family of optimal policies which achieves the optimal cost-to-go $Q_i(x, u; \lambda)$ for a single arm $i$
with price $\lambda$ and which takes actions $u = \pi_i(x; \lambda)$ when in state $x$, where $u = 0$ means passive
and $u = 1$ means active. At first glance, we might intuitively suppose that it becomes less and
less attractive to be active as the price $\lambda$ increases, so that as the price is increased beyond some
value $\lambda_i(x)$, the optimal action switches from active to passive. At this price we are ambivalent
between being active and passive, so that $Q_i(x, 0; \lambda_i(x)) = Q_i(x, 1; \lambda_i(x))$. Such a value $\lambda_i(x)$
is called the Whittle index for arm $i$ in state $x$. Indeed, if there is a family of optimal policies
for which

\[ \pi_i(x; \lambda_{hi}) \le \pi_i(x; \lambda_{lo}) \quad \text{for all states } x \text{ and all pairs of prices } \lambda_{hi} > \lambda_{lo} \]

then an optimal solution to the relaxed problem for price $\lambda$ is to activate arm $i$ if and only if
$\lambda < \lambda_i(x)$. If a restless bandit satisfies this condition, it is said to be indexable. It is important
to note that some restless bandits are not indexable, and for such bandits activating arm $i$ if and
only if $\lambda < \lambda_i(x)$ does not correspond to an optimal solution to the relaxed problem. Indeed, in a
study of small randomly-generated problems, Weber and Weiss [53] found that roughly 10% of
problems were not indexable.

As a policy based on $\lambda_i(x)$ is so good for the relaxed problem when the arms are indexable,
this motivates us to use $\lambda_i(x)$ as a heuristic for the original problem. This heuristic is called
Whittle's index policy and at each time it activates the $m$ arms with the highest indexes $\lambda_i(x)$.
Further motivation for studying indexability is that for ordinary bandits the Whittle index
reduces to the Gittins index, making the Whittle index policy optimal when only one arm may
be active at each time, that is when $m = 1$. More generally, Whittle's index policy is not
optimal for some restless bandit problems even when the arms are indexable, but indexability
is still a rather useful concept, since if all arms are indexable and certain other conditions hold,
Whittle's policy is asymptotically optimal, as we now explain. Consider a sequence of restless
bandit problems parameterised by the number of indexable arms $n$ and in which $m = \alpha n$ of
the arms can be simultaneously active for some fixed $\alpha \in (0,1)$. Then as $n$ tends to infinity,
the time-average cost per arm for Whittle's index policy converges to the time-average cost
per arm for an optimal policy, provided a certain fluid approximation has a unique fixed point.
This result was first demonstrated by Weber and Weiss [53] who for simplicity of exposition
only considered the symmetric case in which the $n$ arms have identical costs and transition
probabilities. Recently, Verloop [5D| extended this result to asymmetric cases involving multiple
types of arms. Interestingly, this extension also covers cases where new arms arrive and old
arms depart.

Restless bandits associated with scalar Kalman(-Bucy) filters in continuous time were
recently shown to be indexable [12] and the corresponding discrete-time problem has attracted
considerable attention over a long period. However, that attention has produced
no satisfactory proof of indexability, even for scalar time-series and even if we assume that
there is a monotone optimal policy for the single-arm problem, which is a policy that plays the
arm if and only if the relevant belief-state exceeds some threshold (here the relevant belief-state
is a posterior variance). Theorem 1 of this paper addresses that gap. After formalising the
problem (Section 2), we describe the concepts and intuition (Section 3) behind the main result
(Section 4). The main tools are mechanical words (which are not sufficiently well-known) and
Schur convexity. As these tools are associated with rather general theorems, we believe that
future work (Section 5) should enable substantial generalisation of our results.


2 Problem and Index 


We consider the problem of tracking $N$ time-series, which we call arms, in discrete time. The
state $Z_{i,t} \in \mathbb{R}$ of arm $i$ at time $t \in \mathbb{Z}_+$ evolves as a standard-normal random walk independent
of everything but its immediate past ($\mathbb{Z}_+$, $\mathbb{R}_-$ and $\mathbb{R}_+$ all include zero). The action space is
$\{1, \dots, N\}$. Action $u_t = i$ makes an expensive observation $Y_{i,t}$ of arm $i$ which is normally
distributed about $Z_{i,t}$ with precision $b_i \in \mathbb{R}_+$ and we receive cheap observations $Y_{j,t}$ of each
other arm $j$ with precision $a_j \in \mathbb{R}_+$ where $a_j < b_j$ and $a_j = 0$ means no observation at all.

Let $Z_t, Y_t, \mathcal{H}_t, \mathcal{F}_t$ be the state, observation, history and observed history, so that
$Z_t := (Z_{1,t}, \dots, Z_{N,t})$, $Y_t := (Y_{1,t}, \dots, Y_{N,t})$, $\mathcal{H}_t := ((Z_0, u_0, Y_0), \dots, (Z_t, u_t, Y_t))$ and
$\mathcal{F}_t := ((u_0, Y_0), \dots, (u_t, Y_t))$. Then we formalise the above as ($\mathbf{1}_{\cdot}$ is the indicator function)

\[ Z_{i,0} \sim \mathcal{N}(0,1), \qquad Z_{i,t+1} \mid \mathcal{H}_t \sim \mathcal{N}(Z_{i,t}, 1), \qquad Y_{i,t} \mid \mathcal{H}_{t-1}, Z_t, u_t \sim \mathcal{N}\!\left(Z_{i,t},\ \frac{\mathbf{1}_{u_t \ne i}}{a_i} + \frac{\mathbf{1}_{u_t = i}}{b_i}\right). \]

Note that this setting is readily generalised to random-walk increments whose variance is not 1 by a change of variables.

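Thus the posterior belief is given by the Kalman filter as $Z_{i,t} \mid \mathcal{F}_t \sim \mathcal{N}(\hat{Z}_{i,t}, x_{i,t})$ where the
posterior mean is $\hat{Z}_{i,t} \in \mathbb{R}$ and the error variance $x_{i,t} \in \mathbb{R}_+$ satisfies

\[ x_{i,t+1} = \phi_{i,\mathbf{1}_{u_{t+1} = i}}(x_{i,t}) \quad \text{where} \quad \phi_{i,0}(x) := \frac{x + 1}{a_i x + a_i + 1} \quad \text{and} \quad \phi_{i,1}(x) := \frac{x + 1}{b_i x + b_i + 1}. \tag{1} \]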
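The variance recursion (1) can be sketched numerically as follows. This is a minimal illustration rather than part of the paper: the function names `phi` and `fixed_point` and the precisions `a = 0.1`, `b = 1.0` are assumptions chosen for the example. The fixed point is obtained by solving $\phi(x) = x$, which reduces to the quadratic $c x^2 + c x - 1 = 0$ for precision $c > 0$.

```python
import math


def phi(x, precision):
    """One step of (1): a unit-variance random-walk step followed by an
    observation with the given precision."""
    return (x + 1.0) / (precision * x + precision + 1.0)


def fixed_point(precision):
    """Positive root of precision*x^2 + precision*x - 1 = 0, i.e. phi(x) = x.
    Requires precision > 0."""
    return (-precision + math.sqrt(precision ** 2 + 4.0 * precision)) / (2.0 * precision)


a, b = 0.1, 1.0                      # hypothetical cheap / expensive precisions
y0, y1 = fixed_point(a), fixed_point(b)
```

With these values the fixed point under expensive observations is smaller than under cheap observations, which matches the ordering $y_1 < y_0$ assumed later in Assumption A2.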


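Problem KF1. Let $\pi$ be a policy so that $u_t = \pi(\mathcal{F}_{t-1})$. Let $x^\pi_{i,t}$ be the error variance
under $\pi$. The problem is to choose $\pi$ so as to minimise the following objective for discount
factor $\beta \in [0,1)$. The objective consists of a weighted sum of error variances $x^\pi_{i,t}$ with weights
$w_i \in \mathbb{R}_+$ plus observation costs $h_i \in \mathbb{R}_+$ for $i = 1, \dots, N$:

\[ \mathbb{E}\left[ \sum_{t=0}^{\infty} \sum_{i=1}^{N} \beta^t \left\{ h_i \mathbf{1}_{u_t = i} + w_i x^\pi_{i,t} \right\} \right] = \sum_{t=0}^{\infty} \sum_{i=1}^{N} \beta^t \left\{ h_i \mathbf{1}_{u_t = i} + w_i x^\pi_{i,t} \right\} \]

where the equality follows as (1) is a deterministic mapping (and assuming $\pi$ is deterministic).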

Single-Arm Problem and Whittle Index. Now fix an arm $i$ and write $x^\pi_t, \phi_0(\cdot), \dots$
instead of $x^\pi_{i,t}, \phi_{i,0}(\cdot), \dots$. Say there are now two actions $u_t = 0, 1$ corresponding to cheap and
expensive observations respectively and the expensive observation now costs $h + \nu$ where $\nu \in \mathbb{R}$.
The single-arm problem is to choose a policy, which here is an action sequence, $\pi := (u_0, u_1, \dots)$

\[ \text{so as to minimise} \quad V^\pi(x \mid \nu) := \sum_{t=0}^{\infty} \beta^t \left\{ (h + \nu) u_t + w x^\pi_t \right\} \quad \text{where } x_0 = x. \tag{2} \]

Let $Q(x, \alpha \mid \nu)$ be the optimal cost-to-go in this problem if the first action must be $\alpha$ and let $\pi^*$
be an optimal policy, so that

\[ Q(x, \alpha \mid \nu) := (h + \nu)\alpha + w x + \beta V^{\pi^*}(\phi_\alpha(x) \mid \nu). \]

For any fixed $x \in \mathbb{R}_+$, the value of $\nu$ for which actions $u_0 = 0$ and $u_0 = 1$ are both optimal is
known as the Whittle index $\lambda^W(x)$ assuming it exists and is unique. In other words

\[ \text{The Whittle index } \lambda^W(x) \text{ is the solution to } Q(x, 0 \mid \lambda^W(x)) = Q(x, 1 \mid \lambda^W(x)). \tag{3} \]


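Let us consider a policy which takes action $u_0 = \alpha$ then acts optimally producing actions $u^{\alpha*}_t(x)$
and error variances $x^{\alpha*}_t(x)$. Then (3) gives

\[ \sum_{t=0}^{\infty} \beta^t \left\{ (h + \lambda^W(x)) u^{0*}_t(x) + w x^{0*}_t(x) \right\} = \sum_{t=0}^{\infty} \beta^t \left\{ (h + \lambda^W(x)) u^{1*}_t(x) + w x^{1*}_t(x) \right\}. \]

Solving this linear equation for the index $\lambda^W(x)$ gives

\[ \lambda^W(x) = w \, \frac{\sum_{t=1}^{\infty} \beta^t \left( x^{0*}_t(x) - x^{1*}_t(x) \right)}{\sum_{t=0}^{\infty} \beta^t \left( u^{1*}_t(x) - u^{0*}_t(x) \right)} - h. \tag{4} \]

Whittle [23] recognised that for his index policy (play the arm with the largest $\lambda^W(x)$) to make
sense, any arm which receives an expensive observation for added cost $\nu$ must also receive an
expensive observation for added cost $\nu' < \nu$. Such problems are said to be indexable. The
question resolved by this paper is whether Problem KF1 is indexable. Equivalently, is $\lambda^W(x)$
non-decreasing in $x \in \mathbb{R}_+$?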


3 Main Result, Key Concepts and Intuition 

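We make the following intuitive assumption about threshold (monotone) policies.

A1. For some $\hat{x} \in \mathbb{R}_+$ depending on $\nu \in \mathbb{R}$, the policy $u_t = \mathbf{1}_{x_t > \hat{x}}$ is optimal for problem (2).

Note that under A1, definition (3) means the policy $u_t = \mathbf{1}_{x_t > x}$ is also optimal, so we can
choose

\[ u^{0*}_t(x) := \begin{cases} 0 & \text{if } x^{0*}_{t-1}(x) \le x \\ 1 & \text{otherwise} \end{cases} \quad \text{and} \quad x^{0*}_t(x) := \begin{cases} \phi_0(x^{0*}_{t-1}(x)) & \text{if } x^{0*}_{t-1}(x) \le x \\ \phi_1(x^{0*}_{t-1}(x)) & \text{otherwise} \end{cases} \]
\[ u^{1*}_t(x) := \begin{cases} 0 & \text{if } x^{1*}_{t-1}(x) < x \\ 1 & \text{otherwise} \end{cases} \quad \text{and} \quad x^{1*}_t(x) := \begin{cases} \phi_0(x^{1*}_{t-1}(x)) & \text{if } x^{1*}_{t-1}(x) < x \\ \phi_1(x^{1*}_{t-1}(x)) & \text{otherwise} \end{cases} \tag{5} \]

where $x^{0*}_0(x) = x^{1*}_0(x) = x$. We refer to $x^{0*}_t(x), x^{1*}_t(x)$ as the $x$-threshold orbits (Figure 1).

Figure 1: Orbit $x^{0*}_t(x)$ traces the path ABCDE... for the word $01w = 01101$. Orbit $x^{1*}_t(x)$
traces the path FGHIJ... for the word $10w = 10101$. Word $w = 101$ is a palindrome.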

We are now ready to state our main result. 

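Theorem 1. Suppose a threshold policy (A1) is optimal for the single-arm problem (2).
Then Problem KF1 is indexable. Specifically, for any $b > a > 0$ let

\[ \phi_0(x) := \frac{x + 1}{a x + a + 1}, \qquad \phi_1(x) := \frac{x + 1}{b x + b + 1} \]

and for any $w \in \mathbb{R}_+$, $h \in \mathbb{R}$ and $0 < \beta < 1$, let

\[ \lambda^W(x) := w \, \frac{\sum_{t=1}^{\infty} \beta^t \left( x^{0*}_t(x) - x^{1*}_t(x) \right)}{\sum_{t=0}^{\infty} \beta^t \left( u^{1*}_t(x) - u^{0*}_t(x) \right)} - h \tag{6} \]

in which action sequences $u^{0*}_t(x), u^{1*}_t(x)$ and error variance sequences $x^{0*}_t(x), x^{1*}_t(x)$ are given
in terms of $\phi_0, \phi_1$ by (5). Then $\lambda^W(x)$ is a continuous and non-decreasing function of $x \in \mathbb{R}_+$.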
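The index formula can be explored numerically with a truncated version of the two sums. The following is only a sketch under stated assumptions, not the authors' code: it takes the convention that action $u_t$ produces variance $x_{t+1}$ and that the first action is forced (0 for one orbit, 1 for the other), resolves any later exact tie as passive, truncates both series at a hypothetical `horizon`, and uses illustrative parameters `a`, `b`, `w`, `h`, `beta`. A different indexing convention for the actions would rescale the ratio by a constant factor of $\beta$ but would not affect monotonicity.

```python
def phi(x, precision):
    """Variance update (x + 1) / (precision*x + precision + 1)."""
    return (x + 1.0) / (precision * x + precision + 1.0)


def threshold_orbit(x, first_action, a, b, horizon):
    """Actions u_0..u_{T-1} and variances x_1..x_T when the first action is
    forced and later actions follow the threshold rule 'active iff state > x'."""
    actions, variances = [], []
    state, action = x, first_action
    for _ in range(horizon):
        actions.append(action)
        state = phi(state, b if action == 1 else a)
        variances.append(state)
        action = 1 if state > x else 0   # assumption: exact ties are passive
    return actions, variances


def whittle_index(x, a=0.1, b=1.0, w=1.0, h=0.0, beta=0.9, horizon=400):
    """Truncated evaluation of the ratio of discounted sums minus h."""
    u0, x0 = threshold_orbit(x, 0, a, b, horizon)
    u1, x1 = threshold_orbit(x, 1, a, b, horizon)
    numerator = sum(beta ** (t + 1) * (x0[t] - x1[t]) for t in range(horizon))
    denominator = sum(beta ** t * (u1[t] - u0[t]) for t in range(horizon))
    return w * numerator / denominator - h


grid = [0.3, 0.6, 1.0, 1.5, 2.0, 2.5, 3.0]
values = [whittle_index(x) for x in grid]
```

On such a grid the computed values should increase with $x$, which is what Theorem 1 asserts for the exact (untruncated) index.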

We are now ready to describe the key concepts underlying this result. 

Words. In this paper, a word $w$ is a string on $\{0,1\}^*$ with $k$-th letter $w_k$ and $w_{i:j} :=
w_i w_{i+1} \dots w_j$. The empty word is $\epsilon$, the concatenation of words $u, v$ is $uv$, the word that is
the $n$-fold repetition of $w$ is $w^n$, the infinite repetition of $w$ is $w^\omega$ and $\tilde{w}$ is the reverse of $w$, so
$w = \tilde{w}$ means $w$ is a palindrome. The length of $w$ is $|w|$ and $|w|_u$ is the number of times that
word $u$ appears in $w$, overlaps included.

Christoffel, Sturmian and Mechanical Words. It turns out that the action sequences
in (5) are given by such words, so the following definitions are central to this paper.

The Christoffel tree (Figure 2) is an infinite complete binary tree [5] in which each node
is labelled with a pair $(u, v)$ of words. The root is $(0, 1)$ and the children of $(u, v)$ are $(u, uv)$
and $(uv, v)$. The Christoffel words are the words $0, 1$ and the concatenations $uv$ for all $(u, v)$ in
that tree. The fractions $|uv|_1 / |uv|_0$ form the Stern-Brocot tree [3] which contains each positive
rational number exactly once. Also, infinite paths in the Stern-Brocot tree converge to the
positive irrational numbers. Analogously, Sturmian words could be thought of as infinitely-
long Christoffel words.

Figure 2: Part of the Christoffel tree. (The root $(0,1)$ has children $(0,01)$ and $(01,1)$, whose
children are $(0,001)$, $(001,01)$, $(01,011)$ and $(011,1)$, and so on.)
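The tree construction just described can be sketched in a few lines. This is an illustrative helper only; the names `christoffel_level` and `christoffel_words` are assumptions, and words are represented as Python strings of `0`s and `1`s.

```python
def christoffel_level(depth):
    """Nodes (u, v) of the Christoffel tree at the given depth (root is depth 0)."""
    level = [("0", "1")]
    for _ in range(depth):
        # each node (u, v) has children (u, uv) and (uv, v)
        level = [child for (u, v) in level for child in ((u, u + v), (u + v, v))]
    return level


def christoffel_words(depth):
    """The words 0 and 1 together with uv for every node down to `depth`."""
    words = {"0", "1"}
    for d in range(depth + 1):
        words.update(u + v for (u, v) in christoffel_level(d))
    return words
```

For every word $0w1$ produced this way, the central factor $w$ reads the same in both directions, which is the palindrome property used in Section 4.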

Alternatively, among many known characterisations, the Christoffel words can be defined
as the words $0, 1$ and the words $0w1$ where $\alpha := |0w1|_1 / |0w1|$ and

\[ (01w)_n := \lfloor (n+1)\alpha \rfloor - \lfloor n\alpha \rfloor \]

for any relatively prime natural numbers $|0w1|_0$ and $|0w1|_1$ and for $n = 1, 2, \dots, |0w1|$. The
Sturmian words are then the infinite words $0 w_1 w_2 \cdots$ where, for $n = 1, 2, \dots$ and $\alpha \in (0,1) \setminus \mathbb{Q}$,

\[ (01 w_1 w_2 \cdots)_n := \lfloor (n+1)\alpha \rfloor - \lfloor n\alpha \rfloor . \]

We use the notation $0w1$ for Sturmian words although they are infinite.

The set of mechanical words is the union of the Christoffel and Sturmian words [13]. (Note
that the mechanical words are sometimes defined in terms of infinite repetitions of the Christoffel
words.)

Majorisation. As in [T3], let $x, y \in \mathbb{R}^m$ and let $x_{(i)}$ and $y_{(i)}$ be their elements sorted in
ascending order. We say $x$ is weakly supermajorised by $y$ and write $x \prec^w y$ if

\[ \sum_{k=1}^{j} x_{(k)} \ge \sum_{k=1}^{j} y_{(k)} \quad \text{for all } j = 1, \dots, m. \]

If this is an equality for $j = m$ we say $x$ is majorised by $y$ and write $x \prec y$. It turns out that

\[ x \prec y \iff \sum_{k=1}^{j} x_{[k]} \le \sum_{k=1}^{j} y_{[k]} \ \text{for } j = 1, \dots, m-1 \text{ with equality for } j = m \]

where $x_{[k]}, y_{[k]}$ are the sequences sorted in descending order. For $x, y \in \mathbb{R}^m$ we have [14]

\[ x \prec y \iff \sum_{i=1}^{m} f(x_i) \le \sum_{i=1}^{m} f(y_i) \ \text{for all convex functions } f : \mathbb{R} \to \mathbb{R}. \]

More generally, a real-valued function $\varphi$ defined on a subset $A$ of $\mathbb{R}^m$ is said to be Schur-convex
on $A$ if $x \prec y$ implies that $\varphi(x) \le \varphi(y)$.
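These definitions translate directly into prefix-sum checks. The sketch below is illustrative only; the function names are assumptions and the tolerance used to test equality of totals is an arbitrary choice.

```python
def weakly_supermajorised(x, y):
    """True if every prefix sum of x (sorted ascending) is at least the
    corresponding prefix sum of y (sorted ascending)."""
    prefix_x = prefix_y = 0
    for xi, yi in zip(sorted(x), sorted(y)):
        prefix_x += xi
        prefix_y += yi
        if prefix_x < prefix_y:
            return False
    return True


def majorised(x, y, tol=1e-12):
    """True if x is weakly supermajorised by y and the totals agree."""
    return weakly_supermajorised(x, y) and abs(sum(x) - sum(y)) <= tol


def sum_of_squares(v):
    """A simple Schur-convex function: a sum of a convex function of the entries."""
    return sum(t * t for t in v)
```

For example, $(2, 2)$ is majorised by $(1, 3)$, and correspondingly the sum of squares of $(2,2)$ does not exceed that of $(1,3)$.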

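Möbius Transformations. Let $\mu_A(x)$ denote the Möbius transformation $\mu_A(x) := \frac{A_{11} x + A_{12}}{A_{21} x + A_{22}}$
where $A \in \mathbb{R}^{2 \times 2}$. Möbius transformations such as $\phi_0(\cdot), \phi_1(\cdot)$ are closed under
composition, so for any word $w$ we define $\phi_w(x) := \phi_{w_{|w|}} \circ \cdots \circ \phi_{w_2} \circ \phi_{w_1}(x)$ and $\phi_\epsilon(x) := x$.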
Intuition. Here is the intuition behind our main result. 

For any $x \in \mathbb{R}_+$, the orbits in (5) correspond to a particular mechanical word $0$, $1$ or $0w1$
depending on the value of $x$ (Figure 1). Specifically, for any word $u$, let $y_u$ be the fixed point
of the mapping $\phi_u$ on $\mathbb{R}_+$ so that $\phi_u(y_u) = y_u$ and $y_u \in \mathbb{R}_+$. Then the word corresponding
to $x$ is $1$ for $0 \le x \le y_1$, $0w1$ for $x \in [y_{01w}, y_{10w}]$ and $0$ for $y_0 \le x < \infty$. In passing we note
that these fixed points are sorted in ascending order by the ratio $\rho := |01w|_0 / |01w|_1$ of counts
of 0s to counts of 1s, as illustrated by Figure 3. Interestingly, it turns out that ratio $\rho$ is a
piecewise-constant yet continuous function of $x$, reminiscent of the Cantor function.

Figure 3: Lower fixed points $y_{01w}$ of Christoffel words (black dots), majorisation points for
those words (black circles) and the tree of $\phi_w(0)$ (blue). The horizontal axis is the ratio $|01w|_0 / |01w|_1$.

Also, composition of Möbius transformations is homomorphic to matrix multiplication so
that

\[ \mu_A \circ \mu_B(x) = \mu_{AB}(x) \quad \text{for any } A, B \in \mathbb{R}^{2 \times 2}. \]

Thus, the index (6) can be written in terms of the orbits of a linear system (11) given by $0$, $1$
or $0w1$. Further, if $A \in \mathbb{R}^{2 \times 2}$ and $\det(A) = 1$ then the gradient of the corresponding Möbius
transformation is the convex function

\[ \frac{d\mu_A(x)}{dx} = \frac{1}{(A_{21} x + A_{22})^2}. \]

So the gradient of the index is the difference of the sums of a convex function of the linear-
system orbits. However, such sums are Schur-convex functions and it follows that the index is
increasing because one orbit weakly supermajorises the other, as we now show for the case $0w1$
(noting that the proof is easier for words $0, 1$). As $0w1$ is a mechanical word, $w$ is a palindrome.
Further, if $w$ is a palindrome, it turns out that the difference between the linear-system orbits
increases with $x$. So, we might define the majorisation point for $w$ as the $x$ for which one
orbit majorises the other. Quite remarkably, if $w$ is a palindrome then the majorisation point
is $\phi_w(0)$ (Proposition 7). Indeed the black circles and blue dots of Figure 3 coincide. Finally,
$\phi_w(0)$ is less than or equal to $y_{01w}$ which is the least $x$ for which the orbits correspond to the
word $0w1$. Indeed, the blue dots of Figure 3 are below the corresponding black dots. Thus one
orbit does indeed supermajorise the other.
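The correspondence between composition and matrix multiplication, and the gradient formula for unit-determinant matrices, can be checked exactly with rational arithmetic. The sketch below is illustrative: the helper names are assumptions, the matrices `F` and `G` are the natural matrix representations of $\phi_0$ and $\phi_1$ for hypothetical precisions `a` and `b`, and the gradient is checked through the exact difference-quotient identity $\frac{\mu_A(x+\delta) - \mu_A(x)}{\delta} = \frac{\det A}{(A_{21}x + A_{22})(A_{21}(x+\delta) + A_{22})}$.

```python
from fractions import Fraction


def mobius(A, x):
    """Moebius transformation (A11*x + A12) / (A21*x + A22)."""
    (p, q), (r, s) = A
    return (p * x + q) / (r * x + s)


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


a, b = Fraction(1, 10), Fraction(1)      # hypothetical precisions
F = ((1, 1), (a, a + 1))                 # represents phi_0, determinant 1
G = ((1, 1), (b, b + 1))                 # represents phi_1, determinant 1
x, delta = Fraction(3, 7), Fraction(1, 1000)

composed = mobius(G, mobius(F, x))       # phi_1(phi_0(x))
via_product = mobius(matmul(G, F), x)    # Moebius map of the product GF

GF = matmul(G, F)
quotient = (mobius(GF, x + delta) - mobius(GF, x)) / delta
predicted = 1 / ((GF[1][0] * x + GF[1][1]) * (GF[1][0] * (x + delta) + GF[1][1]))
```

Letting `delta` tend to zero in the last identity recovers the gradient $1/(A_{21}x + A_{22})^2$ stated above.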


4 Proof of Main Result 

4.1 Mechanical Words 

The Möbius transformations of (1) satisfy the following assumption for $\mathcal{I} := \mathbb{R}_+$. We prove that
the fixed point $y_w$ of word $w$ (the solution to $\phi_w(x) = x$ on $\mathcal{I}$) is unique in the supplementary
material.

Assumption A2. Functions $\phi_0 : \mathcal{I} \to \mathcal{I}$, $\phi_1 : \mathcal{I} \to \mathcal{I}$, where $\mathcal{I}$ is an interval of $\mathbb{R}$, are
increasing and non-expansive, so for all $x, y \in \mathcal{I} : x < y$ and for $k \in \{0, 1\}$ we have

\[ \underbrace{\phi_k(x) < \phi_k(y)}_{\text{increasing}} \quad \text{and} \quad \underbrace{\phi_k(y) - \phi_k(x) < y - x}_{\text{non-expansive}} . \]

Furthermore, the fixed points $y_0, y_1$ on $\mathcal{I}$ satisfy $y_1 < y_0$.

Hence the following two propositions (supplementary material) apply to $\phi_0, \phi_1$ of (1) on
$\mathcal{I} = \mathbb{R}_+$.

Proposition 1. Suppose A2 holds, $x \in \mathcal{I}$ and $w$ is a non-empty word. Then

\[ x < \phi_w(x) \iff \phi_w(x) < y_w \iff x < y_w \quad \text{and} \quad x > \phi_w(x) \iff \phi_w(x) > y_w \iff x > y_w. \]

For a given $x$, in the notation of (5), we call the shortest word $u$ such that $(u^{1*}_2, u^{1*}_3, \dots) = u^\omega$
the $x$-threshold word. Proposition 2 generalises a recent result about $x$-threshold words in a
setting where $\phi_0, \phi_1$ are linear [TB].

Proposition 2. Suppose A2 holds and $0w1$ is a mechanical word. Then

\[ 0w1 \text{ is the } x\text{-threshold word} \iff x \in [y_{01w}, y_{10w}]. \]

Also, if $x_0, x_1 \in \mathcal{I}$ with $x_0 \ge y_0$ and $x_1 \le y_1$ then the $x_0$- and $x_1$-threshold words are $0$ and $1$.

We also use the following very interesting fact (Proposition 4.2 on p.28 of [S]).

Proposition 3. Suppose $0w1$ is a mechanical word. Then $w$ is a palindrome.


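4.2 Properties of the Linear-System Orbits $M(w)$ and Prefix Sums $S(w)$

Definition. Assume that $a, b \in \mathbb{R}_+$ and $a < b$. Consider the matrices

\[ F := \begin{pmatrix} 1 & 1 \\ a & 1 + a \end{pmatrix}, \qquad G := \begin{pmatrix} 1 & 1 \\ b & 1 + b \end{pmatrix} \qquad \text{and} \qquad K := \begin{pmatrix} -1 & -1 \\ 0 & 1 \end{pmatrix} \]

so that the Möbius transformations $\mu_F, \mu_G$ are the functions $\phi_0, \phi_1$ of (1) and $GF - FG =
(b - a)K$. Given any word $w \in \{0,1\}^*$, we define the matrix product $M(w)$

\[ M(w) := M(w_{|w|}) \cdots M(w_1), \quad \text{where } M(\epsilon) := I,\ M(0) := F \text{ and } M(1) := G \]

where $I \in \mathbb{R}^{2 \times 2}$ is the identity and the prefix sum $S(w)$ as the matrix polynomial

\[ S(w) := \sum_{k=1}^{|w|} M(w_{1:k}), \quad \text{where } S(\epsilon) := 0 \text{ (the all-zero matrix).} \tag{7} \]

For any $A \in \mathbb{R}^{2 \times 2}$, let $\mathrm{tr}(A)$ be the trace of $A$, let $A_{ij} = [A]_{ij}$ be the entries of $A$ and let $A \ge 0$
indicate that all entries of $A$ are non-negative.

Remark. Clearly, $\det(F) = \det(G) = 1$ so that $\det(M(w)) = 1$ for any word $w$. Also, $S(w)$
corresponds to the partial sums of the linear-system orbits, as hinted in the previous section.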

The following proposition captures the role of palindromes (proof in the supplementary 
material). 

Proposition 4. Suppose $w$ is a word, $p$ is a palindrome and $n \in \mathbb{Z}_+$. Then

1. $M(p) = \begin{pmatrix} \frac{fh+1}{h+f} & f \\ \frac{h^2-1}{h+f} & h \end{pmatrix}$ for some $f, h \in \mathbb{R}$,

2. $\mathrm{tr}(M(10p)) = \mathrm{tr}(M(01p))$,

3. If $u \in \{p(10p)^n, (10p)^n 10\}$ then $M(u) - M(\tilde{u}) = \lambda K$ for some $\lambda \in \mathbb{R}_-$,

4. If $w$ is a prefix of $p$ then $[M(p(10p)^n 10w)]_{22} \le [M(p(01p)^n 01w)]_{22}$,

5. $[M((10p)^n 10w)]_{21} \ge [M((01p)^n 01w)]_{21}$,

6. $[M((10p)^n 1)]_{21} \ge [M((01p)^n 0)]_{21}$.

We now demonstrate a surprisingly simple relation between S{w) and M{w). 
Proposition 5. Suppose $w$ is a palindrome. Then

\[ S_{21}(w) = M_{22}(w) - 1 \quad \text{and} \quad S_{22}(w) = M_{12}(w) + S_{21}(w). \tag{8} \]

Furthermore, if $\Delta_k := [S(10w) M(w(10w)^k) - S(01w) M(w(01w)^k)]_{22}$ then

\[ \Delta_k = 0 \quad \text{for all } k \in \mathbb{Z}_+. \tag{9} \]

Proof. Let us write $M := M(w)$, $S := S(w)$. We prove (8) by induction on $|w|$. In the base
case $w \in \{\epsilon, 0, 1\}$. For $w = \epsilon$, $M_{22} - 1 = 0 = S_{21}$, $M_{12} + S_{21} = 0 = S_{22}$. For $w \in \{0, 1\}$,
$M_{22} - 1 = c = S_{21}$, $M_{12} + S_{21} = 1 + c = S_{22}$ for some $c \in \{a, b\}$. For the inductive step, in
accordance with Claim 1 of Proposition 19 assume $w \in \{0v0, 1v1\}$ for some word $v$ satisfying

\[ M(v) = \begin{pmatrix} \frac{fh+1}{h+f} & f \\ \frac{h^2-1}{h+f} & h \end{pmatrix}, \qquad S(v) = \begin{pmatrix} c & d \\ h - 1 & f + h - 1 \end{pmatrix} \qquad \text{for some } c, d, f, h \in \mathbb{R}. \]

For $w = 1v1$, $M := M(1v1) = G M(v) G$ and $S := S(1v1) = G M(v) G + S(v) G + G$. Calculating
the corresponding matrix products and sums gives

\[ S_{21} = (bh + h + bf - 1)(bh + 2h + bf + f + 1)(h + f)^{-1} = M_{22} - 1 \]
\[ S_{22} - S_{21} = bh + 2h + bf + f = M_{12} \]

as claimed. For $w = 0v0$ the claim also holds as $F = G|_{b=a}$. This completes the proof of (8).

Furthermore Part. Let $A := S(w)FG + FG + G$ and $B := S(w)GF + GF + F$. Then

\[ \Delta_k = [(A (M(w)FG)^k - B (M(w)GF)^k) M(w)]_{22} \tag{10} \]

by definition of $S(\cdot)$. By Claim 1 of Proposition 19 and (8) we know that

\[ M(w) = \begin{pmatrix} \frac{fh+1}{h+f} & f \\ \frac{h^2-1}{h+f} & h \end{pmatrix}, \qquad S(w) = \begin{pmatrix} c & d \\ h - 1 & f + h - 1 \end{pmatrix} \qquad \text{for some } c, d, f, h \in \mathbb{R}. \]

Substituting these expressions and the definitions of $F, G$ into the definitions of $A, B$ and then
into (10) for $k \in \{0, 1\}$ directly gives $\Delta_0 = \Delta_1 = 0$ (although this calculation is long).

Now consider the case $k \ge 2$. Claim 2 of Proposition 19 says $\mathrm{tr}(M(10w)) = \mathrm{tr}(M(01w))$
and clearly $\det(M(10w)) = \det(M(01w)) = 1$. Thus we can diagonalise as

\[ M(w)FG =: U D U^{-1}, \qquad M(w)GF =: V D V^{-1}, \qquad D := \mathrm{diag}(\lambda, 1/\lambda) \text{ for some } \lambda \ge 1 \]

so that $\Delta_k = [A U D^k U^{-1} M(w) - B V D^k V^{-1} M(w)]_{22} =: \gamma_1 \lambda^k + \gamma_2 \lambda^{-k}$. So, if $\lambda = 1$ then
$\Delta_k = \gamma_1 + \gamma_2 = \Delta_0$ and we already showed that $\Delta_0 = 0$. Otherwise $\lambda \ne 1$, so $\Delta_0 = \Delta_1 = 0$
implies $\gamma_1 + \gamma_2 = \gamma_1 \lambda + \gamma_2 \lambda^{-1} = 0$ which gives $\gamma_1 = \gamma_2 = 0$. Thus for any $k \in \mathbb{Z}_+$ we have
$\Delta_k = \gamma_1 \lambda^k + \gamma_2 \lambda^{-k} = 0$. □








4.3 Majorisation 

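The following is a straightforward consequence of results in the majorisation literature, proved
in the supplementary material. We emphasize that the notation $\prec^w$ has nothing to do with the
notion of $w$ as a word.

Proposition 6. Suppose $x, y \in \mathbb{R}^m_+$ and $f : \mathbb{R} \to \mathbb{R}$ is a symmetric function that is convex and
decreasing on $\mathbb{R}_+$. Then $x \prec^w y$ and $\beta \in [0,1]$ $\Rightarrow$ $\sum_{i=1}^{m} \beta^i f(x_{(i)}) \le \sum_{i=1}^{m} \beta^i f(y_{(i)})$.

For any $x \in \mathbb{R}$ and any fixed word $w$, define the sequences for $n \in \mathbb{Z}_+$ and $k = 1, \dots, m$

\[ x_{nm+k}(x) := [M((10w)^n (10w)_{1:k}) v(x)]_2, \qquad \sigma_x^{(n)} := (x_{nm+1}(x), \dots, x_{nm+m}(x)) \]
\[ y_{nm+k}(x) := [M((01w)^n (01w)_{1:k}) v(x)]_2, \qquad \sigma_y^{(n)} := (y_{nm+1}(x), \dots, y_{nm+m}(x)) \tag{11} \]

where $m := |10w|$ and $v(x) := (x, 1)^T$.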

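Proposition 7. Suppose $w$ is a palindrome and $x \ge \phi_w(0)$. Then $\sigma_x^{(n)}$ and $\sigma_y^{(n)}$ are ascending
sequences on $\mathbb{R}_+$ and $\sigma_x^{(n)} \prec^w \sigma_y^{(n)}$ for any $n \in \mathbb{Z}_+$.

Proof. Clearly $\phi_w(0) \ge 0$ so $x \ge 0$ and hence $v(x) \ge 0$. So for any word $u$ and letter $c \in \{0,1\}$ we
have $M(uc) v(x) = M(c) M(u) v(x) \ge M(u) v(x) \ge 0$ as $M(c) \ge I$. Thus $x_{k+1}(x) \ge x_k(x) \ge 0$
and $y_{k+1}(x) \ge y_k(x) \ge 0$. In conclusion, $\sigma_x^{(n)}$ and $\sigma_y^{(n)}$ are ascending sequences on $\mathbb{R}_+$.

Now $\phi_w(0) = \frac{[M(w)]_{12}}{[M(w)]_{22}}$. Thus $[A v(\phi_w(0))]_2 = \frac{[A M(w)]_{22}}{[M(w)]_{22}}$ for any $A \in \mathbb{R}^{2 \times 2}$ and

\[ x_{nm+k}(\phi_w(0)) - y_{nm+k}(\phi_w(0)) = \frac{1}{[M(w)]_{22}} \left[ \left( M((10w)^n (10w)_{1:k}) - M((01w)^n (01w)_{1:k}) \right) M(w) \right]_{22} \le 0 \]

for $k = 2, \dots, m$ by Claim 4 of Proposition 19. So all but the first term of the sum $T_m(\phi_w(0))$
is non-positive where

\[ T_j(x) := \sum_{k=1}^{j} \left( x_{nm+k}(x) - y_{nm+k}(x) \right). \]

Thus $T_1(\phi_w(0)) \ge T_2(\phi_w(0)) \ge \dots \ge T_m(\phi_w(0))$. But

\[ T_m(\phi_w(0)) = \frac{1}{[M(w)]_{22}} \sum_{k=1}^{m} \left[ \left( M((10w)^n (10w)_{1:k}) - M((01w)^n (01w)_{1:k}) \right) M(w) \right]_{22} \]
\[ = \frac{1}{[M(w)]_{22}} \left[ S(10w) M(w(10w)^n) - S(01w) M(w(01w)^n) \right]_{22} = 0 \]

where the last step follows from (9). So $T_j(\phi_w(0)) \ge 0$ for $j = 1, \dots, m$. Yet Claims 5 and 6 of
Proposition 19 give $\frac{d}{dx} T_j(x) = \sum_{k=1}^{j} [M((10w)^n (10w)_{1:k}) - M((01w)^n (01w)_{1:k})]_{21} \ge 0$. So for
$x \ge \phi_w(0)$ we have $T_j(x) \ge 0$ for $j = 1, \dots, m$ which means that $\sigma_x^{(n)} \prec^w \sigma_y^{(n)}$. □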


4.4 Indexability 

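Theorem 1. The index $\lambda^W(x)$ of (6) is continuous and non-decreasing for $x \in \mathbb{R}_+$.

Proof. As weight $w$ is non-negative and cost $h$ is a constant we only need to prove the result for
$\lambda(x) := \lambda^W(x)$ with $w = 1$ and $h = 0$, and we can use $w$ to denote a word. By Proposition 2,
$x \in [y_{01w}, y_{10w}]$ for some mechanical word $0w1$. (Cases $x \notin (y_1, y_0)$ are clarified in the
supplementary material.)

Let us show that the hypotheses of Proposition 7 are satisfied by $w$ and $x$. Firstly, $w$ is a
palindrome by Proposition 3. Secondly, $\phi_{w01}(0) \ge 0$ and as $\phi_w(\cdot)$ is monotonically increasing,
it follows that $\phi_w \circ \phi_{w01}(0) \ge \phi_w(0)$. Equivalently, $\phi_{01w} \circ \phi_w(0) \ge \phi_w(0)$ so that $\phi_w(0) \le y_{01w}$
by Proposition 1. Hence $x \ge y_{01w} \ge \phi_w(0)$.

Thus Proposition 7 applies, showing that the sequences $\sigma_x^{(n)}$ and $\sigma_y^{(n)}$, with elements
$x_{nm+k}(x)$ and $y_{nm+k}(x)$ as defined in (11), are non-decreasing sequences on $\mathbb{R}_+$ with
$\sigma_x^{(n)} \prec^w \sigma_y^{(n)}$. Also, $1/x^2$ is a symmetric function that is convex and decreasing on $\mathbb{R}_+$. Therefore
Proposition 6 applies giving

\[ \sum_{k=1}^{m} \left( \frac{\beta^{nm+k-1}}{(y_{nm+k}(x))^2} - \frac{\beta^{nm+k-1}}{(x_{nm+k}(x))^2} \right) \ge 0 \quad \text{for any } n \in \mathbb{Z}_+ \text{ where } m := |01w|. \tag{12} \]

Also Proposition 2 shows that the $x$-threshold orbits are $(\phi_{u_{1:1}}(x), \dots, \phi_{u_{1:t}}(x), \dots)$ and
$(\phi_{l_{1:1}}(x), \dots, \phi_{l_{1:t}}(x), \dots)$ where $u := (01w)^\omega$ and $l := (10w)^\omega$. So the denominator of (6)
is

\[ \sum_{k=0}^{\infty} \beta^{mk} (1 - \beta) = \frac{1 - \beta}{1 - \beta^m} \quad \Rightarrow \quad \lambda(x) = \frac{1 - \beta^m}{1 - \beta} \sum_{k=1}^{\infty} \beta^k \left( \phi_{u_{1:k}}(x) - \phi_{l_{1:k}}(x) \right). \]

Note that $\frac{d}{dx} \frac{ex + f}{gx + h} = \frac{1}{(gx + h)^2}$ for any $eh - fg = 1$. Then (12) gives

\[ \frac{d\lambda(x)}{dx} = \frac{\beta (1 - \beta^m)}{1 - \beta} \sum_{n=0}^{\infty} \sum_{k=1}^{m} \left( \frac{\beta^{nm+k-1}}{(y_{nm+k}(x))^2} - \frac{\beta^{nm+k-1}}{(x_{nm+k}(x))^2} \right) \ge 0 . \]

But $\lambda(x)$ is continuous for $x \in \mathbb{R}_+$ (as shown in the supplementary material). Therefore we
conclude that $\lambda(x)$ is non-decreasing for $x \in \mathbb{R}_+$. □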


5 Further Work 

One might attempt to prove that assumption A1 holds using general results about
monotone optimal policies for two-action MDPs based on submodularity [5] or multimodularity [T].
However, we find counter-examples to the required submodularity condition. Rather, we are
optimistic that the ideas of this paper themselves offer an alternative approach to proving A1.
It would then be natural to extend our results to settings where the underlying state evolves
as $Z_{t+1} \mid \mathcal{H}_t \sim \mathcal{N}(m Z_t, 1)$ for some multiplier $m \ne 1$ and to cost functions other than the
variance. Finally, the question of the indexability of the discrete-time Kalman filter in multiple
dimensions remains open.

6 Supplementary Material: Introduction


The results used but not proved in the main paper are given here as:

• Proposition 9, which was used to show that $\phi_w(0) \le x$,

• the proposition giving the range of $x$ that yields a specific mechanical word,

• Proposition 17 showing the index is continuous for $x \in \mathbb{R}_+$,

• Proposition 19 showing the properties of $M(p)$ when $p$ is a palindrome,

• and the proposition for weak supermajorisation with $\beta \ne 1$.

A clarification of the extreme cases of Theorem 1 of the main paper is presented in the final
section.


7 From x-Threshold Policies to Mechanical Words

Some concepts relating to mechanical words appeared as early as 1771 in Jean Bernoulli's
study of continued fractions (Berstel et al, 2008). The term "mechanical sequences" appears
in the work of Morse and Hedlund (Am. J. Math., Vol 62, No. 1, 1940, p. 1-42) who had
just introduced the term "symbolic dynamics". Morse and Hedlund studied the concept from
the perspective of sequences of the form $\lfloor c + k\beta \rfloor$ for $c, \beta \in \mathbb{R}$ and $k \in \mathbb{Z}$. They also studied
the concept from the perspective of differential equations, motivating the term "Sturmian
sequences." Since that time there has been tremendous progress in the study of such sequences
from the perspective of Combinatorics on Words (Lothaire, 2001). However, the recent (and
highly-approachable) paper of Rajpathak, Pillai and Bandyopadhyay (Chaos, Vol. 22, 2012)
on the piecewise-linear map-with-a-gap discovers such sequences without recognising them as
mechanical sequences. The main proposition of this section is a substantial generalisation of that
result and we could not find this proposition explicitly stated in the literature. Our result is
not surprising if one has the intuition that there is a topological conjugacy between the maps
of this section and the piecewise-linear map-with-a-gap. However, it might be difficult to
explicitly identify the appropriate topological conjugacy and thereby prove our result for all cases
considered here.


7.1 Definitions 

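Let $\pi$ denote a word consisting of a string of 0s and 1s in which the $k$-th letter is $\pi_k$ and letters
$i, i+1, \dots, j$ are $\pi_{i:j}$. Let $|\pi|$ be the length of $\pi$ and $|\pi|_w$ for a word $w$ be the number of
times that word $w$ appears in $\pi$. Let $\epsilon$ denote the empty word and $\pi^\omega$ denote the infinite word
constructed by repeatedly concatenating $\pi$.

Consider two functions $\phi_0 : \mathcal{I} \to \mathcal{I}$ and $\phi_1 : \mathcal{I} \to \mathcal{I}$ where $\mathcal{I}$ is an interval of $\mathbb{R}$. We define
the transformation $\phi_\pi : \mathcal{I} \to \mathcal{I}$ for any word $\pi$ by the composition

\[ \phi_\pi(x) := \phi_{\pi_{|\pi|}} \circ \cdots \circ \phi_{\pi_2} \circ \phi_{\pi_1}(x). \]

Let $y_\pi \in \mathcal{I}$ be the fixed point of $\phi_\pi$ so $\phi_\pi(y_\pi) = y_\pi$, assuming a unique fixed point on $\mathcal{I}$ exists.
Given $x \in \mathcal{I}$, we call the sequence $(x_k : k \ge 1)$ the $x$-threshold orbit for $\phi_0, \phi_1$ if

\[ x_1 = \phi_1(x), \qquad x_{k+1} = \begin{cases} \phi_1(x_k) & \text{if } x_k \ge x \\ \phi_0(x_k) & \text{if } x_k < x \end{cases} \quad \text{for } k \ge 1. \]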
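The orbit definition above can be simulated directly, recording at each step which of the two maps was applied. The sketch below is illustrative only: the function name `threshold_letters` is an assumption, the maps are the Kalman variance updates with hypothetical precisions 0.1 and 1.0, and only a finite prefix of the letter sequence is produced (detecting the shortest repeating word is left out).

```python
def threshold_letters(x, phi0, phi1, n):
    """First n letters pi_1..pi_n such that x_{k+1} = phi_{pi_k}(x_k),
    where the orbit starts from x_1 = phi1(x)."""
    letters = []
    state = phi1(x)
    for _ in range(n):
        letter = 1 if state >= x else 0      # apply phi1 when at or above x
        letters.append(str(letter))
        state = phi1(state) if letter == 1 else phi0(state)
    return "".join(letters)


def make_phi(precision):
    """Variance update (x + 1) / (precision*x + precision + 1)."""
    return lambda x: (x + 1.0) / (precision * x + precision + 1.0)


phi0, phi1 = make_phi(0.1), make_phi(1.0)    # hypothetical precisions
low = threshold_letters(0.5, phi0, phi1, 12)
mid = threshold_letters(1.0, phi0, phi1, 6)
high = threshold_letters(3.0, phi0, phi1, 12)
```

With these parameters a threshold below the fixed point of `phi1` yields only 1s, a threshold above the fixed point of `phi0` yields only 0s, and an intermediate threshold yields a periodic mixture.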


We call $\pi$ the $x$-threshold word for $\phi_0, \phi_1$ if it is the shortest word such that $x_{k+1} = \phi_{(\pi^\omega)_k}(x_k)$
for all $k \ge 1$. We shall just write $x$-threshold orbit and $x$-threshold word where $\phi_0, \phi_1$ are obvious
from the context.

For $p \ge 1$, let $L_p, R_p$ be the morphisms (substitutions)

\[ L_p : \begin{cases} 0 \to 0^{p+1} 1 \\ 1 \to 0^{p} 1 \end{cases} \qquad R_p : \begin{cases} 0 \to 0 1^{p} \\ 1 \to 0 1^{p+1} \end{cases} \]

We say $\pi$ is a valid word if $\pi \in \{0, 1\}$ or $\pi \in \{L_p(w), R_p(w) : p \ge 1\}$ for some valid word $w$.

Remark. The morphisms $L_p, R_p$ generate the Christoffel tree so valid words are mechanical
words. To see this, note that the Christoffel tree is generated by the following morphisms
(Berstel et al, 2008, p. 37)

\[ G : \begin{cases} 0 \to 0 \\ 1 \to 01 \end{cases} \qquad D : \begin{cases} 0 \to 01 \\ 1 \to 1 \end{cases} \]

We may translate (from English to French) as $L_p = G^p \circ D$ and $R_p = D^p \circ G$ so any composition
of $L_p$ and $R_p$ can be written as a composition of $G$ and $D$. Likewise, any composition of $G$ and
$D$ can be written as a composition of $L_p$ and $R_p$. Specifically, if $p_k, q_k, p_{k+1} \ge 2$ then

\[ \cdots \circ G^{p_k - 1} \circ D^{q_k} \circ G^{p_{k+1}} \circ D \circ \cdots \]
\[ = \cdots \circ (G^{p_k - 1} \circ D) \circ (D^{q_k - 1} \circ G) \circ (G^{p_{k+1} - 1} \circ D) \circ \cdots \]
\[ = \cdots \circ L_{p_k - 1} \circ R_{q_k - 1} \circ L_{p_{k+1} - 1} \circ \cdots \]

whereas if $q_k = 1$ we have

\[ \cdots \circ G^{p_k - 1} \circ D \circ G^{p_{k+1}} \circ D \circ \cdots \]
\[ = \cdots \circ (G^{p_k - 1} \circ D) \circ (G^{p_{k+1}} \circ D) \circ \cdots \]
\[ = \cdots \circ L_{p_k - 1} \circ L_{p_{k+1}} \circ \cdots . \]

A symmetric argument holds if $p_k = 1$ or $p_{k+1} = 1$.
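The relation between $L_p, R_p$ and the morphisms $G, D$ can be verified mechanically on the letters. In this sketch a morphism is represented as a dictionary from letters to words; the helper names `apply`, `compose`, `power`, `L` and `R` are assumptions made for the illustration.

```python
G = {"0": "0", "1": "01"}
D = {"0": "01", "1": "1"}
IDENTITY = {"0": "0", "1": "1"}


def apply(morphism, word):
    """Apply a morphism letter by letter."""
    return "".join(morphism[c] for c in word)


def compose(f, g):
    """The morphism f o g, i.e. first g then f."""
    return {c: apply(f, g[c]) for c in "01"}


def power(f, p):
    """The p-fold composition of f with itself (identity for p = 0)."""
    out = IDENTITY
    for _ in range(p):
        out = compose(f, out)
    return out


def L(p):
    """L_p built as G^p o D."""
    return compose(power(G, p), D)


def R(p):
    """R_p built as D^p o G."""
    return compose(power(D, p), G)
```

For instance, `L(2)` maps 0 to `0001` and 1 to `001`, in agreement with the images $0^{p+1}1$ and $0^p 1$ for $p = 2$.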

7.2 Fixed Points 

Throughout, we make the following assumption about The existence of fixed points 

yo,yi is addressed immediately thereafter. 

Assumption A2. Functions $\phi_0 : I \to I$, $\phi_1 : I \to I$, where $I$ is an interval of $\mathbb{R}$, are increasing and non-expansive. Equivalently, for all $x, y \in I$ with $x < y$ and for $k \in \{0,1\}$ we have

\[ \underbrace{\phi_k(x) < \phi_k(y)}_{\text{increasing}} \quad \text{and} \quad \underbrace{\phi_k(y) - \phi_k(x) < y - x}_{\text{non-expansive}} . \]

Furthermore, the fixed points $y_0, y_1$ of $\phi_0, \phi_1$ satisfy $y_1 < y_0$.

Proposition 8. Suppose A2 holds, that $x \in I$ and that $w$ is any non-empty word. Then $\phi_w(x)$ is increasing and non-expansive. Further, the fixed point $y_w$ exists and is unique.

Proof. First we show that $\phi_w(x)$ is increasing, by induction. In the base case, $|w| = 1$ and the claim follows from A2. For the inductive step assume $\phi_u(x)$ is increasing, where $w = au$ for some $a \in \{0,1\}$ and word $u$. Then for any $x, y \in I$ with $x < y$,

\begin{align*}
\phi_w(y) &= \phi_u(\phi_a(y)) \\
&> \phi_u(\phi_a(x)) && \text{as } \phi_a(y) > \phi_a(x) \text{ and } \phi_u \text{ is increasing} \\
&= \phi_w(x).
\end{align*}

Therefore $\phi_w$ is increasing.

Now we show that $\phi_w(x)$ is non-expansive, by induction. If $|w| = 1$ then this follows from A2. Else, say $\phi_u(x)$ is non-expansive where $w = ua$ and $a \in \{0,1\}$. Then for any $x, y \in I$ with $x < y$,

\begin{align*}
\phi_w(y) - \phi_w(x) &= \phi_a(\phi_u(y)) - \phi_a(\phi_u(x)) \\
&< \phi_u(y) - \phi_u(x) && \text{as } \phi_u(y) > \phi_u(x) \text{ and } \phi_a \text{ is non-expansive} \\
&< y - x && \text{as } \phi_u \text{ is non-expansive.}
\end{align*}

Therefore $\phi_w$ is non-expansive.

Let $\psi(x) := \max\{\phi_0(x), \phi_1(x)\}$. As $\phi_1$ is non-expansive we have

\[ y_1 = \phi_1(y_1) > \phi_1(y_0) + y_1 - y_0 \]

which rearranges to give $\phi_1(y_0) < y_0$, so that $\psi(y_0) = y_0$. Also $\psi$ is increasing as $\phi_0, \phi_1$ are increasing, so $\phi_w(y_0) \le \psi^{|w|}(y_0) = y_0$, where $\psi^{|w|}$ denotes the $|w|$-fold composition of $\psi$.

We now prove that $y_w$ exists. The argument of the previous paragraph shows that $g(x) := x - \phi_w(x)$ satisfies $g(y_0) \ge 0$. A symmetric argument leads to the conclusion that $g(y_1) \le 0$. Clearly $g(x)$ is a continuous function, so by the intermediate value theorem, there is some $y \in [y_1, y_0]$ for which $g(y) = 0$. Equivalently $y = \phi_w(y)$. Therefore a fixed point exists.

To show that the fixed point is unique, suppose both $y$ and $z$ are fixed points with $y > z$. As $\phi_w$ is non-expansive we have $\frac{\phi_w(y) - \phi_w(z)}{y - z} < 1$. Yet, as $\phi_w(y) = y$ and $\phi_w(z) = z$ we have

\[ \frac{\phi_w(y) - \phi_w(z)}{y - z} = 1. \]

This is a contradiction. Therefore the fixed point is unique. □
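The existence argument is constructive: $g(x) = x - \phi_w(x)$ changes sign on $[y_1, y_0]$, so bisection locates $y_w$. The sketch below is only an illustration under an assumption that is not part of A2: it takes the concrete maps $\phi_0(x) = (x+1)/(ax+a+1)$ and $\phi_1(x) = (x+1)/(bx+b+1)$ with $b > a > 0$, which are the Möbius maps induced by the matrices $F, G$ of Section 11. Function names are hypothetical.

```python
import math

def phi(r, x):
    """Assumed map (x + 1) / (r x + r + 1); r = a for letter 0, r = b for letter 1."""
    return (x + 1.0) / (r * x + r + 1.0)

def phi_word(word, x, a, b):
    """Apply phi_w, with the first letter of w applied first."""
    for c in word:
        x = phi(b if c == "1" else a, x)
    return x

def letter_fixed_point(r):
    """Positive root of r x^2 + r x - 1 = 0, the fixed point of phi(r, .)."""
    return (-r + math.sqrt(r * r + 4.0 * r)) / (2.0 * r)

def fixed_point(word, a, b, tol=1e-13):
    """Bisection on g(x) = x - phi_w(x) over [y_1, y_0], mirroring the existence proof."""
    lo, hi = letter_fixed_point(b), letter_fixed_point(a)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - phi_word(word, mid, a, b) <= 0.0:
            lo = mid   # g(mid) <= 0: the root is at or to the right of mid
        else:
            hi = mid   # g(mid) > 0: the root is to the left of mid
    return 0.5 * (lo + hi)
```

Bisection is valid here because $g$ is strictly increasing whenever $\phi_w$ is non-expansive, which is also why the root is unique.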

Given a word $w$, the next proposition shows when the transformation $\phi_w$ increases or decreases its argument and what might be deduced from such an increase or decrease.

Proposition 9. Suppose A2 holds, $x \in I$ and $w$ is any non-empty word. Then

\[ x < \phi_w(x) \iff \phi_w(x) < y_w \iff x < y_w \quad \text{and} \quad x > \phi_w(x) \iff \phi_w(x) > y_w \iff x > y_w . \]

Proof. We use Proposition 8 throughout the argument without further mention.

Say $x < y_w$. As $\phi_w$ is increasing,

\[ \phi_w(x) < \phi_w(y_w) = y_w \]

where the equality is the definition of $y_w$. Also, as $\phi_w$ is non-expansive,

\[ y_w = \phi_w(y_w) < \phi_w(x) + y_w - x \]

which rearranges to give $x < \phi_w(x)$.

Now say $x > y_w$. As above, we then have $\phi_w(x) > \phi_w(y_w) = y_w$ and

\[ y_w = \phi_w(y_w) > \phi_w(x) + y_w - x \]

so that $x > \phi_w(x)$.

The contrapositive of $x > y_w \Rightarrow \phi_w(x) > y_w$ is $\phi_w(x) \le y_w \Rightarrow x \le y_w$. But if $\phi_w(x) \ne y_w$ then $x \ne y_w$ as $\phi_w$ is increasing and therefore injective. Thus $\phi_w(x) < y_w \Rightarrow x < y_w$.

The contrapositive of $x > y_w \Rightarrow x > \phi_w(x)$ is $x \le \phi_w(x) \Rightarrow x \le y_w$. But if $x \ne \phi_w(x)$ then $x \ne y_w$ as $y_w$ is a fixed point. So we can conclude that $x < \phi_w(x) \Rightarrow x < y_w$.

By symmetry, $\phi_w(x) > y_w \Rightarrow x > y_w$ and $x > \phi_w(x) \Rightarrow x > y_w$. This completes the proof. □

Proposition 10. Suppose A2 holds and $\pi$ is any word satisfying $|\pi|_0 |\pi|_1 > 0$. Then $y_1 < y_\pi < y_0$.

Proof. Say $y_\pi \le y_1$. As $|\pi|_0 > 0$ we can write $\pi =: s01^q$ for some $q \ge 0$. Thus

\begin{align*}
y_\pi = \phi_\pi(y_\pi) &\le \phi_{s01^q}(y_1) && \text{as } \phi_\pi \text{ is increasing} \\
&= \phi_{s0}(y_1) && \text{as } \phi_\epsilon(y_1) = \phi_1(y_1) = y_1 \\
&> \phi_s(y_1) && \text{by Proposition 9} \\
&\ge y_1 && \text{by repeating the same argument if } |s|_0 > 0.
\end{align*}

But this contradicts $y_\pi \le y_1$. Therefore $y_\pi > y_1$.

A symmetrical argument leads to the conclusion that $y_\pi < y_0$. □


Proposition 11. If A2 holds and $n \ge 1$ then $y_{10^{n-1}} < y_{010^{n-1}} < y_{10^n}$ and $y_{01^n} < y_{101^{n-1}} < y_{01^{n-1}}$.

Proof. As $y_{10^{n-1}} < y_0$ by Proposition 10 we have $\phi_0(y_{10^{n-1}}) > y_{10^{n-1}}$ by Proposition 9, so that

\[ \phi_{010^{n-1}}(y_{10^{n-1}}) = \phi_{10^{n-1}}(\phi_0(y_{10^{n-1}})) > \phi_{10^{n-1}}(y_{10^{n-1}}) = y_{10^{n-1}} \]

so Proposition 9 gives $y_{010^{n-1}} > y_{10^{n-1}}$.

Furthermore $y_{10^n} = \phi_0(y_{010^{n-1}})$ by definition of $y_w$, and $y_{010^{n-1}} < y_0$ by Proposition 10, so that $\phi_0(y_{010^{n-1}}) > y_{010^{n-1}}$ by Proposition 9. Thus $y_{10^n} > y_{010^{n-1}}$.

The proof that $y_{01^n} < y_{101^{n-1}} < y_{01^{n-1}}$ is symmetrical. □
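The interleaving of fixed points in Proposition 11 is easy to observe numerically. This is a hedged sketch, again assuming the concrete maps $\phi_0(x) = (x+1)/(ax+a+1)$ and $\phi_1(x) = (x+1)/(bx+b+1)$ with $b > a > 0$ (an assumption beyond A2); for these maps $\phi_w$ is a contraction on $x \ge 0$, so repeated application converges to $y_w$.

```python
def phi_word(word, x, a, b):
    """Apply phi_w, first letter first, for the assumed maps (x + 1) / (r x + r + 1)."""
    for c in word:
        r = b if c == "1" else a
        x = (x + 1.0) / (r * x + r + 1.0)
    return x

def y(word, a, b, iterations=200):
    """Approximate the fixed point y_w of phi_w by repeated application."""
    x = 1.0
    for _ in range(iterations):
        x = phi_word(word, x, a, b)
    return x

def check_proposition_11(n, a, b):
    """Return True if both chains of Proposition 11 hold numerically for this n."""
    zeros, ones = "0" * (n - 1), "1" * (n - 1)
    first = y("1" + zeros, a, b) < y("01" + zeros, a, b) < y("1" + zeros + "0", a, b)
    second = y("0" + ones + "1", a, b) < y("10" + ones, a, b) < y("0" + ones, a, b)
    return first and second
```

The gaps between consecutive fixed points shrink geometrically with $n$, so a floating-point check of this kind is only meaningful for small $n$.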


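Proposition 12. Suppose A2 holds, $M \in \{L_q, R_q : q \ge 1\}$ and $w$ is any word. Let $\tilde{y}_v$ be the fixed point of $\tilde{\phi}_v := \phi_{M(v)}$ for any word $v$ and let $0\tilde{w}1 := M(0w1)$. Then

\[ \tilde{x} \in [\tilde{y}_{01w}, \tilde{y}_{10w}] \iff x := \phi_{0^q}(\tilde{x}) \in [y_{01\tilde{w}}, y_{10\tilde{w}}] . \]

Proof. Say $M = L_q$. Note that

\begin{align*}
\phi_{0^q}(\tilde{y}_{01w}) &= \phi_{0^q}(y_{L_q(01w)}) && \text{as } \tilde{y}_v \text{ is the fixed point of } \tilde{\phi}_v = \phi_{L_q(v)} \\
&= \phi_{0^q}(y_{0^q 01 L_q(1w)}) && \text{as } L_q(0) = 0^q 01 \\
&= y_{01 L_q(1w) 0^q} && \text{as } \phi_a(y_{ab}) = y_{ba} \text{ for any words } a, b \\
&= y_{01\tilde{w}} && \text{as } 0\tilde{w}1 = L_q(0w1) = 0 L_q(1w) 0^q 1
\end{align*}

and

\begin{align*}
\phi_{0^q}(\tilde{y}_{10w}) &= \phi_{0^q}(y_{L_q(10w)}) \\
&= \phi_{0^q}(y_{0^q 1 L_q(0w)}) \\
&= y_{1 L_q(0w) 0^q} \\
&= y_{10\tilde{w}} && \text{as } 0\tilde{w}1 = L_q(0w) 0^q 1 .
\end{align*}

Proposition 8 shows that $\tilde{y}_{01w}, \tilde{y}_{10w}$ exist. So the above equalities show that an inverse $\phi_{0^q}^{-1}$ exists for $x \in \{y_{01\tilde{w}}, y_{10\tilde{w}}\}$. As $\phi_{0^q}$ is increasing and continuous, we have

\[ x \in [y_{01\tilde{w}}, y_{10\tilde{w}}] \iff \tilde{x} \in [\phi_{0^q}^{-1}(y_{01\tilde{w}}), \phi_{0^q}^{-1}(y_{10\tilde{w}})] = [\tilde{y}_{01w}, \tilde{y}_{10w}] . \]

The proof for $M = R_q$ is symmetric. □

7.3 x-Threshold Words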

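Proposition 13. Suppose A2 holds, $\pi$ is the $x$-threshold word and $n \ge 1$. Then

1. $x < y_{10^{n-1}} \Rightarrow |\pi^\omega|_{0^n} = 0$

2. $x > y_{010^{n-1}} \Rightarrow |\pi^\omega|_{10^{n-1}1} = 0$

3. $x > y_{01^{n-1}} \Rightarrow |\pi^\omega|_{1^n} = 0$

4. $x < y_{101^{n-1}} \Rightarrow |\pi^\omega|_{01^{n-1}0} = 0$

Proof. If $x \le y_1$ then it follows from Proposition 9 that the $x$-threshold word is $\pi = 1$. Likewise if $x > y_0$ then the $x$-threshold word is $\pi = 0$. In these cases Claims 1 and 2 hold, so in the following we assume that $y_1 < x \le y_0$.

Claim 1: Let $(x_k)$ be the $x$-threshold orbit. If $(\pi^\omega)_{k:k+n-2} = 0^{n-1}$ for some $k$, then

\begin{align*}
x_{k+n-1} &= \phi_{0^{n-1}}(x_k) && \text{by definition of } (x_k) \\
&\ge \phi_{0^{n-1}}(\phi_1(x)) && \text{as } x_k \ge \phi_1(x) \text{ for all } k > 0 \text{ and } \phi_{0^{n-1}} \text{ is increasing} \\
&> x && \text{if } x < y_{10^{n-1}} \text{ by Proposition 9.}
\end{align*}

But if $x_{k+n-1} > x$ then $\pi_{k+n-1} = 1$ by definition of $\pi$. Therefore $|\pi^\omega|_{0^n} = 0$.

Claim 2: Let $(x_k)$ be the $x$-threshold orbit. If $(\pi^\omega)_{k:k+n-1} = 10^{n-1}$ for some $k$, then

\begin{align*}
x_{k+n} &= \phi_{10^{n-1}}(x_k) \\
&\le \phi_{10^{n-1}}(\phi_0(x)) && \text{as } x_k \le \phi_0(x) \text{ for all } k > 0 \text{ and } \phi_{10^{n-1}} \text{ is increasing} \\
&= \phi_{010^{n-1}}(x) \\
&< x && \text{if } x > y_{010^{n-1}} \text{ by Proposition 9.}
\end{align*}

But if $x_{k+n} < x$ then $(\pi^\omega)_{k+n} = 0$. Therefore $|\pi^\omega|_{10^{n-1}1} = 0$.

The proof of Claims 3 and 4 is symmetrical. □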

Proposition 14. Suppose A2 holds and $\pi$ is an $x$-threshold word. Then

1. $|\pi|_{00} > 0 \Rightarrow \pi = L_n(w)$ for some word $w$ and some $n \ge 1$

2. $|\pi|_{11} > 0 \Rightarrow \pi = R_n(w)$ for some word $w$ and some $n \ge 1$

Proof. First, applying Claims 1 and 3 of Proposition 13 with $n = 2$ we have $|\pi|_{00} = 0$ for $x < y_{10}$ and $|\pi|_{11} = 0$ for $x > y_{01}$. Furthermore $y_{10} = \phi_0(y_{01}) > y_{01}$ by Proposition 9. Thus $\pi$ cannot contain both 00 and 11.

So, if $|\pi|_{00} > 0$ then $\pi$ is of the form $0^{q_1} 1 0^{q_2} 1 \ldots$ with strings of 0s separated by individual 1s. Let $q := \min_k q_k$. By Propositions 11 and 13, $I_q := (y_{10^{q-1}}, y_{010^q})$ is the only set of $x$ values for which $\pi^\omega$ can contain $10^q1$. Thus $\pi^\omega$ can only contain both $10^q1$ and $10^{q+1}1$ in the interval

\[ F_q := I_q \cap I_{q+1} = (y_{10^{q-1}}, y_{010^q}) \cap (y_{10^q}, y_{010^{q+1}}) = (y_{10^q}, y_{010^q}) \]

noting Proposition 11 gives $y_{10^{q-1}} < y_{010^{q-1}} < y_{10^q} < y_{010^q}$.

Finally, we have $F_q \cap F_{q'} = \emptyset$ for $q \ne q'$, which also follows from Proposition 11. Thus if $|\pi|_{00} > 0$ then $\pi$ is a concatenation of $L_q(0)$ and $L_q(1)$. Equivalently $\pi = L_q(w)$ for some word $w$ and some $q \ge 1$ as in Claim 1.

The proof of Claim 2 is symmetric. □

Proposition 15. Suppose A2 holds and $\pi$ is an $x$-threshold word. Then $\pi$ is a valid word.

Proof. There are three cases to consider: either $|\pi|_{00} = |\pi|_{11} = 0$ or $|\pi|_{00} > 0$ or $|\pi|_{11} > 0$.

First case: The only non-empty words not containing 00 or 11 are $0, 1, (01)^n, (10)^n$ for some $n \ge 1$. Now $x$-threshold words start with 0 unless $x \le y_1$ (in which case $\pi = 1$) so $\pi \ne (10)^n$. Further, the $x$-threshold word was defined to be the shortest word such that $x_{k+1} = \phi_{(\pi^\omega)_k}(x_k)$, so this leaves us with the options $0, 1, 01$. These are all valid words.

Second case: If $\pi$ contains 00, we may write $\pi = L_q(w)$ for some word $w$, by Proposition 14. Now from point $x_k$ on the $x$-threshold orbit we have $\pi_{k:k+q} = 0^{q+1}$ if and only if $\phi_{0^q}(x_k) < x$, which corresponds to $x_k < \phi_{0^q}^{-1}(x) =: \tilde{x}$. So the word $w$ corresponds to an $\tilde{x}$-threshold orbit $(\tilde{x}_k : k \ge 1)$ for $\psi_0(x) := \phi_{0^{q+1}1}(x)$, $\psi_1(x) := \phi_{0^q1}(x)$. To spell it out, we have

\[ \tilde{x}_1 = \psi_1(\tilde{x}), \qquad \tilde{x}_{k+1} = \psi_{w_k}(\tilde{x}_k), \qquad w_k = \begin{cases} 1 & \text{if } \tilde{x}_k \ge \tilde{x} \\ 0 & \text{if } \tilde{x}_k < \tilde{x} \end{cases} \qquad \text{for } k \ge 1 \]

and as for the original system, we define $\tilde{y}_\pi$ as the fixed point $\tilde{y}_\pi = \psi_\pi(\tilde{y}_\pi)$.

Now $\psi_0, \psi_1$ are non-negative, as $\phi_0, \phi_1$ are non-negative. Also $\psi_0, \psi_1$ are monotonically increasing and non-expansive by Proposition 8. Further,

\[ \phi_{0^{q+1}1}(y_{0^q1}) = \phi_{0^q1}(\phi_0(y_{0^q1})) > \phi_{0^q1}(y_{0^q1}) = y_{0^q1} \]

so that $y_{0^{q+1}1} > y_{0^q1}$ by Proposition 9. But by definition $\tilde{y}_0 = y_{0^{q+1}1}$ and $\tilde{y}_1 = y_{0^q1}$, so that $\tilde{y}_1 < \tilde{y}_0$. Therefore $\psi_0, \psi_1$ satisfy A2.

Third case: We prove that $\pi = R_q(w)$ for some positive integer $q$ and word $w$. We also show that word $w$ is an $\tilde{x}$-threshold word for a pair of functions (say) $\chi_0, \chi_1$ which satisfy A2. The argument is symmetric to the second case, so it is omitted.

In conclusion, either

1. $\pi \in \{0, 1, L_1(1)\}$ which are valid words,

2. or $\pi = L_q(w)$ where $w$ is an $\tilde{x}$-threshold word for $\psi_0, \psi_1$, which satisfy the assumptions of the preceding propositions, and therefore $w$ satisfies this conclusion,

3. or $\pi = R_q(w)$ where $w$ is an $\tilde{x}$-threshold word for $\chi_0, \chi_1$, which satisfy the assumptions of the preceding propositions, and therefore $w$ satisfies this conclusion.

Thus $\pi$ is a valid word. This completes the proof. □

The following proposition shows that all valid words are $x$-threshold words and tells us explicitly which values of $x$ produce a given valid word. It is one of the key results of the main paper.

Proposition 16. Suppose A2 is satisfied and $0w1$ is any valid word. Then

\[ 0w1 \text{ is the } x\text{-threshold word} \iff x \in [y_{01w}, y_{10w}] . \]

Proof. Let $V_1 := \{L_q(1), R_q(1) : q \ge 1\}$ and $V_{n+1} := \{L_q(v), R_q(v) : v \in V_n, q \ge 1\}$. Note that $V_1$ contains $L_q(0) = 0^{q+1}1 = L_{q+1}(1)$ and $R_q(0) = 01^q$, which for $q \ge 2$ equals $R_{q-1}(1)$ and for $q = 1$ equals $01 = L_1(1)$. Thus $\bigcup_{n \ge 1} V_n$ is the set of all valid words of form $0w1$.

We use induction with hypothesis

\[ H_n : \quad 0w1 \in V_n \text{ is the } x\text{-threshold word} \iff x \in [y_{01w}, y_{10w}] . \]

Base case ($H_1$). Say $0w1 = 0^q1$ is the $x$-threshold word. Then

\begin{align*}
x &> \phi_{(10^q)^n 10^{q-1}}(x) && \text{for all } n \ge 0 \\
&= \phi_{(010^{q-1})^n}(\phi_{10^{q-1}}(x)) \\
\Rightarrow \quad x &\ge \lim_{n \to \infty} \phi_{(010^{q-1})^n}(\phi_{10^{q-1}}(x)) = y_{010^{q-1}} .
\end{align*}

The definition of the $x$-threshold word also gives $x \le \phi_{10^q}(x)$. Therefore $x \le y_{10^q}$ by Proposition 9. Thus if $0^q1$ is the $x$-threshold word then $x \in [y_{01w}, y_{10w}]$.

Now say $x \in [y_{010^{q-1}}, y_{10^q}]$. Proposition 10 gives $y_1 < x < y_0$ so that the $x$-threshold orbit $(x_k)$ is contained in $(y_1, y_0)$. So Proposition 9 shows that $\phi_0(x_k) > x_k$ and $\phi_1(x_k) < x_k$ for all $k > 0$. So to prove that the $x$-threshold word is $0^q1$ we need only show that $\phi_{(10^q)^n 10^{q-1}}(x) < x$ and $\phi_{(10^q)^n}(x) \ge x$ for all $n \ge 0$. But if $x \ge y_{010^{q-1}}$ then for all $n \ge 0$

\begin{align*}
x &\ge \phi_{(010^{q-1})^n}(x) && \text{by Proposition 9} \\
&> \phi_{(010^{q-1})^n}(\phi_{10^{q-1}}(x)) && \text{as } y_{10^{q-1}} < y_{010^{q-1}} \le x \text{ by Proposition 11} \\
&= \phi_{(10^q)^n 10^{q-1}}(x) .
\end{align*}

Also if $x \le y_{10^q}$ then $\phi_{(10^q)^n}(x) \ge x$ for all $n \ge 0$ by Proposition 9. Therefore for $0w1 = 0^q1$, we have that $x \in [y_{01w}, y_{10w}]$ implies that $0w1$ is the $x$-threshold word.

For $0w1 = 01^q$, the proof that $\pi = 01^q \iff x \in [y_{01w}, y_{10w}]$ is symmetric, so it is omitted.

Inductive Step. Assume $0w1$ satisfies $H_n$.

Say $0\tilde{w}1 = L_q(0w1)$. Let $k_i := |L_q(((0w1)^\omega)_{1:i-1})| + 1$ so $(\pi^\omega)_{k_i}$ is aligned with the start of the $i$-th letter of $(0w1)^\omega$. Let $x_k := \phi_{((10\tilde{w})^\omega)_{1:k}}(x)$, $\tilde{x}_i := x_{k_i}$, $x = \phi_{0^q}(\tilde{x})$ and let $\tilde{y}_v$ denote the fixed point of $\tilde{\phi}_v := \phi_{L_q(v)}$ for any word $v$. Then we have

\begin{align*}
& L_q(0w1) \text{ is the } x\text{-threshold word for } \phi_0, \phi_1 \\
\iff\ & ((0\tilde{w}1)^\omega)_{k_i : k_i + q} = 0^{q+1} \text{ if and only if } \phi_{0^q}(x_{k_i}) < x \\
\iff\ & ((0w1)^\omega)_i = 0 \text{ if and only if } \tilde{x}_i < \tilde{x} \\
\iff\ & 0w1 \text{ is the } \tilde{x}\text{-threshold word for } \tilde{\phi}_0, \tilde{\phi}_1 \\
\iff\ & \tilde{x} \in [\tilde{y}_{01w}, \tilde{y}_{10w}] && \text{as } 0w1 \text{ satisfies } H_n \\
\iff\ & x \in [y_{01\tilde{w}}, y_{10\tilde{w}}] && \text{by Proposition 12.}
\end{align*}

Symmetrically we may conclude that $\pi = 0\tilde{w}1 = R_q(0w1) \iff x \in [y_{01\tilde{w}}, y_{10\tilde{w}}]$. Therefore $H_{n+1}$ is true.

This completes the proof. □ 
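Proposition 16 can be illustrated by simulating the threshold orbit and reading off the periodic word. The sketch below rests on assumptions that should be checked against the main paper: the orbit starts at $x_1 = \phi_1(x)$, letter $k$ is 1 exactly when $x_k \ge x$, and the maps are the concrete $\phi_0(x) = (x+1)/(ax+a+1)$, $\phi_1(x) = (x+1)/(bx+b+1)$ with $b > a > 0$. It uses floating point, so it is unreliable for $x$ close to an interval endpoint or for very long words.

```python
def step(c, z, a, b):
    """One application of the assumed map phi_c(z) = (z + 1) / (r z + r + 1)."""
    r = b if c == "1" else a
    return (z + 1.0) / (r * z + r + 1.0)

def threshold_word(x, a, b, n_letters=300):
    """Shortest period of the first n_letters letters of the x-threshold itinerary."""
    z = step("1", x, a, b)               # assumed start of the orbit: x_1 = phi_1(x)
    letters = []
    for _ in range(n_letters):
        c = "1" if z >= x else "0"       # assumed rule: letter 1 exactly when x_k >= x
        letters.append(c)
        z = step(c, z, a, b)
    s = "".join(letters)
    for period in range(1, n_letters + 1):
        if all(s[i] == s[i - period] for i in range(period, n_letters)):
            return s[:period]

def y(word, a, b, iterations=200):
    """Fixed point y_w of phi_w (first letter applied first), by repeated application."""
    z = 1.0
    for _ in range(iterations):
        for c in word:
            z = step(c, z, a, b)
    return z

def in_proposition_16_interval(x, a, b):
    """For a threshold word of the form 0w1, test whether y_{01w} <= x <= y_{10w}."""
    word = threshold_word(x, a, b)
    w = word[1:-1]
    return y("01" + w, a, b) <= x <= y("10" + w, a, b)
```

For example, with $a = 1$, $b = 2$ the value $x = 0.5$ lies between $y_{01} \approx 0.380$ and $y_{10} \approx 0.580$, and the simulated itinerary alternates 0, 1, 0, 1, so the word is 01.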


8 Continuity of the Index 

We showed that the Whittle index is increasing on the domain of each fixed Christoffel word. 
However, we also need to show that the index is continuous as we move between words. So 
here we prove the following proposition. 

Proposition 17. Suppose $\lambda(\cdot)$ is as in the main paper. Then $\lambda(x)$ is a continuous function of $x \in \mathbb{R}_+$.

We use the following definitions.

Definition. Let $\overline{w}$ be the reverse of word $w$, $w^\omega$ be the word constructed by concatenating $w$ infinitely many times, $|w|$ be the length of word $w$ and $|w|_u$ be the number of times that word $u$ is a factor of $w$.

Definition. For a possibly-infinite word $w$ and numbers $x \in \mathbb{R}$, $\beta \in (0, 1)$ define

\[ S(w, x) := \sum_{n=0}^{|w|-1} \beta^n \phi_{w_{1:n}}(x), \qquad \lambda(0w1, x) := \frac{1 - \beta^{|0w1|}}{1 - \beta} \left( S((01w)^\omega, x) - S((10w)^\omega, x) \right) . \]

Remark. If $\pi$ is the $x$-threshold word then $\lambda(x) = \lambda(\pi, x)$ where $\lambda(x)$ is the Whittle index.

Remark. For a word $ab$, this definition gives

\[ S(ab, x) = S(a, x) + \beta^{|a|} S(b, \phi_a(x)) \tag{13} \]

so for $|\phi_{a^\omega}(x)| < \infty$ and $\beta \in (0,1)$ we have

\[ S(a^\omega b, x) = S(a^\omega, x). \tag{14} \]

Further, if $x_a = \phi_a(x_a)$ then the formula for the sum of a geometric progression gives

\[ S(a^\omega, x_a) = \frac{S(a, x_a)}{1 - \beta^{|a|}} . \tag{15} \]
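Identities (13) and (15) can be confirmed numerically with a direct implementation of the discounted sum. This is a sketch under two assumptions: that $S(w, x) = \sum_{n=0}^{|w|-1} \beta^n \phi_{w_{1:n}}(x)$ with the empty prefix contributing $x$ itself, and that the maps are the concrete $\phi_0(x) = (x+1)/(ax+a+1)$, $\phi_1(x) = (x+1)/(bx+b+1)$. An infinite word $a^\omega$ is approximated by a long finite power.

```python
def step(c, x, a, b):
    """Assumed map phi_c(x) = (x + 1) / (r x + r + 1), r = a for '0' and r = b for '1'."""
    r = b if c == "1" else a
    return (x + 1.0) / (r * x + r + 1.0)

def phi_word(word, x, a, b):
    """Apply phi_w with the first letter applied first."""
    for c in word:
        x = step(c, x, a, b)
    return x

def S(word, x, a, b, beta):
    """Discounted sum over n < |w| of beta^n * phi_{w_{1:n}}(x)."""
    total, z = 0.0, x
    for n, c in enumerate(word):
        total += beta ** n * z   # z equals phi applied along the first n letters
        z = step(c, z, a, b)
    return total

def fixed_point(word, a, b, iterations=200):
    """Fixed point of phi_w by repeated application."""
    z = 1.0
    for _ in range(iterations):
        z = phi_word(word, z, a, b)
    return z
```

With these helpers, (13) is the statement that `S(u + v, x, ...)` equals `S(u, x, ...) + beta ** len(u) * S(v, phi_word(u, x, ...), ...)`, and (15) compares `S(u * N, x_u, ...)` for large `N` with `S(u, x_u, ...) / (1 - beta ** len(u))`.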


Definition. Let $\mathcal{X}_\pi$ be the range of $x$ for which the $x$-threshold word is $\pi$.

The following construction is closely related to the beautiful Christoffel tree (Berstel et al, 2008).

Definition. Consider the mapping $C$ which takes a sequence of words and returns a sequence containing the original words mingled with the concatenation of neighbouring words as follows:

\[ C((a, b, c, d, \ldots, x, y, z)) := (a, ab, b, bc, c, cd, d, \ldots, x, xy, y, yz, z). \]

Now consider the sequences $t_k := C^k((0, 1))$ for $k \ge 0$. The first few such sequences are

\begin{align*}
t_0 &= (0, 1) \\
t_1 &= (0, 01, 1) \\
t_2 &= (0, 001, 01, 011, 1) \\
t_3 &= (0, 0001, 001, 00101, 01, 01011, 011, 0111, 1).
\end{align*}

Remark. If $u \in t_k$ then $|u| \ge 1$ for any $k \ge 0$. Now suppose $u, v$ are adjacent in $t_k$ and we have $|uv| \ge k + 2$. Then $t_{k+1}$ contains $u, uv, v$ from which we can construct $uuv$ and $uvv$. But $|uuv| = |u| + |uv| \ge 1 + k + 2 = k + 3$ and $|uvv| = |uv| + |v| \ge k + 2 + 1 = k + 3$. Thus, by induction, we have shown that

\[ |uv| \ge k + 2 \text{ for any adjacent pair } u, v \text{ in } t_k \text{ and any } k \ge 0. \tag{16} \]
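The mapping $C$ and the length bound (16) are straightforward to reproduce. The following minimal sketch represents words as Python strings; the function names `mingle` and `t` are illustrative choices, not notation from the paper.

```python
def mingle(words):
    """The mapping C: interleave each pair of neighbours with their concatenation."""
    out = []
    for u, v in zip(words, words[1:]):
        out += [u, u + v]
    out.append(words[-1])
    return out

def t(k):
    """The sequence t_k = C^k((0, 1))."""
    seq = ["0", "1"]
    for _ in range(k):
        seq = mingle(seq)
    return seq

def min_adjacent_length(k):
    """Smallest |uv| over adjacent pairs u, v in t_k; (16) says this is at least k + 2."""
    seq = t(k)
    return min(len(u) + len(v) for u, v in zip(seq, seq[1:]))
```

Each application of `mingle` roughly doubles the sequence, so $t_k$ has $2^k + 1$ entries.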


8.1 Long Common Prefixes 


We gather the results needed to prove Proposition 17. Most of these results relate to the notion that if $|x - y|$ is small and $a, b$ are the $x$- and $y$-threshold words, then words $a, b$ usually have a long common prefix, although this is not always the case.

The following simple result is repeatedly used in the other Lemmas of this subsection. 

Lemma 1. Suppose $(0a1, 0b1)$ is a standard pair. Then $a10b = b01a$.

Proof. As $(0a1, 0b1)$ is a standard pair, $0a10b1 =: 0w1$ is a Christoffel word. As $0a1, 0b1, 0w1$ are Christoffel words, $a, b, w$ are palindromes. Thus $a10b = w = \overline{w} = \overline{b}01\overline{a} = b01a$. □
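Lemma 1 can be spot-checked on the sequences $t_k$. The sketch below assumes, as the proof of Lemma 4 does, that adjacent words $0a1, 0b1$ in $t_k$ form a standard pair; it rebuilds $t_k$ locally so that it is self-contained.

```python
def mingle(words):
    """The mapping C: interleave each pair of neighbours with their concatenation."""
    out = []
    for u, v in zip(words, words[1:]):
        out += [u, u + v]
    out.append(words[-1])
    return out

def t(k):
    """The sequence t_k = C^k((0, 1))."""
    seq = ["0", "1"]
    for _ in range(k):
        seq = mingle(seq)
    return seq

def lemma_1_holds(k):
    """Check a10b == b01a and the palindrome property for adjacent 0a1, 0b1 in t_k."""
    seq = t(k)
    for u, v in zip(seq, seq[1:]):
        if len(u) < 2 or len(v) < 2:
            continue                      # the words 0 and 1 are not of the form 0a1
        a, b = u[1:-1], v[1:-1]
        if a + "10" + b != b + "01" + a:
            return False
        if a != a[::-1] or b != b[::-1]:
            return False
    return True
```

For instance, the adjacent words 001 and 00101 give $a = 0$, $b = 010$ and both sides equal 010010.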


If $(0a1, 0b1)$ is a standard pair, then the interval $\mathcal{X}_{0b1}$ is immediately to the left of $\mathcal{X}_{0a1(0b1)^\omega}$. Since the words $0b1$ and $0a1(0b1)^\omega$ can differ within the first few letters, continuity of $\lambda(x)$ at $x = \sup \mathcal{X}_{0b1}$ is not obvious. Similarly, $\mathcal{X}_{(0a1)^\omega 0b1}$ is immediately to the left of $\mathcal{X}_{0a1}$. However, the factors $1 - \beta^{|(0a1)^\omega 0b1|}$ and $1 - \beta^{|0a1|}$ appearing in the definitions of the corresponding Whittle indices are different for $|a| < \infty$. Thus continuity of $\lambda(x)$ at $x = \inf \mathcal{X}_{0a1}$ is not obvious. The next two Lemmas address these questions.

Lemma 2. Suppose $(0a1, 0b1)$ is a standard pair and let $x = \phi_{10b}(x)$. Then

\[ \lambda(0b1, x) = \lambda(0a1(0b1)^\omega, x). \]

Proof. The right-hand side $\lambda(0a1(0b1)^\omega, x)$ involves the sum

\begin{align*}
S(10a1(0b1)^\omega, x) &= S(10b01a(10b)^\omega, x) && \text{by Lemma 1} \\
&= S(10b, x) + \beta^{|10b|} S(01a(10b)^\omega, \phi_{10b}(x)) && \text{by (13)} \\
&= S(10b, x) + \beta^{|10b|} S(01a(10b)^\omega, x) && \text{as } x = \phi_{10b}(x) \\
&= (1 - \beta^{|10b|}) S((10b)^\omega, x) + \beta^{|10b|} S(01a(10b)^\omega, x) && \text{by (15).} \tag{17}
\end{align*}

Now we note that repeated application of Lemma 1 gives

\[ 01a(10b)^\omega = 01a10b(10b)^\omega = 01b01a(10b)^\omega = \cdots = (01b)^\omega 01a. \tag{18} \]

Thus

\begin{align*}
\lambda(0a1(0b1)^\omega, x) &= \frac{1 - \beta^{|0a1(0b1)^\omega|}}{1 - \beta} \left( S((01a1(0b1)^\omega)^\omega, x) - S((10a1(0b1)^\omega)^\omega, x) \right) \\
&= \frac{S(01a1(0b1)^\omega, x) - S(10a1(0b1)^\omega, x)}{1 - \beta} && \text{by (14)} \\
&= \frac{1 - \beta^{|10b|}}{1 - \beta} \left( S(01a(10b)^\omega, x) - S((10b)^\omega, x) \right) && \text{by (17)} \\
&= \frac{1 - \beta^{|10b|}}{1 - \beta} \left( S((01b)^\omega, x) - S((10b)^\omega, x) \right) && \text{by (18) and (14)} \\
&= \lambda(0b1, x).
\end{align*}

This completes the proof. □

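Lemma 3. Suppose $(0a1, 0b1)$ is a standard pair and let $x = \phi_{01a}(x)$. Then

\[ \lambda((0a1)^\omega 0b1, x) = \lambda(0a1, x). \]

Proof. The left-hand side $\lambda((0a1)^\omega 0b1, x)$ involves the sum

\begin{align*}
S(01(a10)^\omega b, x) &= S(01(a10)^\omega, x) && \text{by (14)} \\
&= S(01a, x) + \beta^{|01a|} S((10a)^\omega, \phi_{01a}(x)) && \text{by (13)} \\
&= S(01a, x) + \beta^{|01a|} S((10a)^\omega, x) && \text{as } x = \phi_{01a}(x) \\
&= (1 - \beta^{|01a|}) S((01a)^\omega, x) + \beta^{|01a|} S((10a)^\omega, x) && \text{by (15).} \tag{19}
\end{align*}

Thus

\begin{align*}
\lambda((0a1)^\omega 0b1, x) &= \frac{1 - \beta^{|(0a1)^\omega 0b1|}}{1 - \beta} \left( S((01(a10)^\omega b)^\omega, x) - S((10(a10)^\omega b)^\omega, x) \right) \\
&= \frac{1}{1 - \beta} \left( S(01(a10)^\omega b, x) - S((10a)^\omega, x) \right) && \text{by (14)} \\
&= \frac{1 - \beta^{|01a|}}{1 - \beta} \left( S((01a)^\omega, x) - S((10a)^\omega, x) \right) && \text{by (19)} \\
&= \lambda(0a1, x).
\end{align*}

This completes the proof. □

To demonstrate continuity at other points, we will need to rely on the fact that nearby words often have a long common prefix as shown by the following two Lemmas.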

Lemma 4. Suppose $(0a1, 0b1)$ is a subsequence of $t_k$ for some $k \ge 1$. Then $0b01a$ is a prefix of both $(0a1)^\omega$ and $0b(01b)^\omega$.

Proof. Let $a = b \cdots$ indicate that $b$ is a prefix of word $a$ and consider the statements

\[ A(a, b) : (a10)^\omega = b \cdots \quad \text{and} \quad B(a, b) : (b01)^\omega = a \cdots . \]

It suffices to show that $A(a, b)$ and $B(a, b)$ are true for any adjacent words $0a1, 0b1$ in $t_k$ for $k \ge 0$. This is because

\[ A(a, b) \Rightarrow (0a1)^\omega = 0a10(a10)^\omega = 0a10b \cdots = 0b01a \cdots \]

where the last equality follows from Lemma 1, and

\[ B(a, b) \Rightarrow 0b(01b)^\omega = 0b01(b01)^\omega = 0b01a \cdots \]

which are the claims of the Lemma.

We shall use induction. Take $t_2 = (0, 001, 01, 011, 1)$ as the base case. We must show that $A(0, \epsilon), B(0, \epsilon), A(\epsilon, 1), B(\epsilon, 1)$ are true. However these statements are respectively that $(010)^\omega = \epsilon \cdots$, $(01)^\omega = 0 \cdots$, $(10)^\omega = 1 \cdots$, $(101)^\omega = \epsilon \cdots$ and are all true.

Otherwise, say $A(a, b), B(a, b)$ are true for any adjacency $0a1, 0b1$ in $t_k$. Let $0a10b1 = 0c1$ so

\[ c = a10b = b01a \]

using Lemma 1 again. Then the statements $A(a, c), B(a, c), A(c, b), B(c, b)$ are all true as

\begin{align*}
(a10)^\omega &= a10(a10)^\omega = a10b \cdots = c \cdots && \text{by } A(a, b) \text{ and as } c = a10b \\
(c01)^\omega &= c \cdots = a \cdots && \text{as } c = a10b \\
(c10)^\omega &= c \cdots = b \cdots && \text{as } c = b01a \\
(b01)^\omega &= b01(b01)^\omega = b01a \cdots = c \cdots && \text{by } B(a, b) \text{ and as } c = b01a.
\end{align*}

Thus $A(a, b), B(a, b)$ are true for all adjacencies $0a1, 0b1$ in $t_{k+1}$. This completes the proof. □

Lemma 5. Suppose $0a1, 0b1$ are adjacent in $t_k$ and that $0c1$ lies strictly between them in $t_{k'}$ for some $0 < k < k'$. Then $0c1 = 0b01a \cdots$.

Proof. The interval of $t_{k'}$ between $0a1, 0b1$ is constructed from $0a1, 0b1$ in exactly the same way as $t_{k'-k}$ was constructed from $0, 1$. Thus $0c1 = (0a1)^q 0b1 \cdots$ for some positive integer $q$. Now recall that $0b01a = 0a10b$ by Lemma 1. Thus $0c1 = (0a1)^{q-1} 0a10b1 \cdots = (0a1)^{q-1} 0b01a1 \cdots = 0b(01a)^q 1 \cdots = 0b01a \cdots$ as claimed. □
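The common-prefix claims of Lemmas 4 and 5 can be tested directly on the sequences $t_k$. This is a minimal sketch with illustrative function names; infinite periodic words are truncated to the length of the prefix being tested.

```python
def mingle(words):
    """The mapping C: interleave each pair of neighbours with their concatenation."""
    out = []
    for u, v in zip(words, words[1:]):
        out += [u, u + v]
    out.append(words[-1])
    return out

def t(k):
    """The sequence t_k = C^k((0, 1))."""
    seq = ["0", "1"]
    for _ in range(k):
        seq = mingle(seq)
    return seq

def has_prefix(base, period, prefix):
    """True if prefix is a prefix of the infinite word base + period^omega."""
    reps = len(prefix) // len(period) + 1
    return (base + period * reps).startswith(prefix)

def common_prefix_holds(k, k_later):
    """Check Lemma 4 in t_k and Lemma 5 for the words of t_{k_later} lying in between."""
    tk, tl = t(k), t(k_later)
    for u, v in zip(tk, tk[1:]):
        if len(u) < 2 or len(v) < 2:
            continue                                   # skip the words 0 and 1
        a, b = u[1:-1], v[1:-1]
        target = "0" + b + "01" + a
        if not has_prefix("", u, target):              # (0a1)^omega
            return False
        if not has_prefix("0" + b, "01" + b, target):  # 0b(01b)^omega
            return False
        between = tl[tl.index(u) + 1 : tl.index(v)]
        if not all(c.startswith(target) for c in between):
            return False
    return True
```

By (16) the tested prefix $0b01a$ has length at least $k + 1$, which is what makes the tail terms in the continuity proof small.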

Although the existence of a long common prefix for nearby words suggests continuity, to 
prove anything we must bound the residual after removing the long common prefix. The 
following Lemma is one way to achieve this. 

Lemma 6. Suppose $x \ge y \ge 0$, let $0w1$ be the $x$-threshold word and let $(01w)^\omega = su$, $(10w)^\omega = s'u'$ where $|s| = |s'|$. Then $|S(u, \phi_s(y)) - S(u', \phi_{s'}(y))| \le \frac{x+1}{1-\beta}$.

Proof. The highest point on the orbits $(\phi_{((01w)^\omega)_{1:k}}(x) : k \ge 0)$ and $(\phi_{((10w)^\omega)_{1:k}}(x) : k \ge 0)$ is $\phi_0(x) \le x + 1$ since $0w1$ is the $x$-threshold word. The terms $a_k, b_k$ of the discounted sums

\[ S(u, \phi_s(y)) =: \sum_{k=0}^{\infty} \beta^k a_k \quad \text{and} \quad S(u', \phi_{s'}(y)) =: \sum_{k=0}^{\infty} \beta^k b_k \]

are from the orbits $(\phi_{((01w)^\omega)_{1:k}}(y) : k \ge 0)$ and $(\phi_{((10w)^\omega)_{1:k}}(y) : k \ge 0)$, and $\phi_{u''}(x) \ge \phi_{u''}(y)$ for any word $u''$ as $x \ge y$. Therefore terms $a_k, b_k$ are also no higher than $\phi_0(x) \le x + 1$. Furthermore, terms $a_k, b_k$ are non-negative, so that $|a_k - b_k| \le x + 1$. Thus

\[ |S(u, \phi_s(y)) - S(u', \phi_{s'}(y))| \le \sum_{k=0}^{\infty} \beta^k |a_k - b_k| \le \frac{x+1}{1-\beta} . \]

□

Although it is clear that $\lambda(\pi, x)$ is continuous, a bound on its slope is helpful.


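Lemma 7. Suppose $x \ge 0$ and that $0w1$ is a valid word. Then $|\lambda'(0w1, x)| \le \frac{1}{(1-\beta)^2}$.

Proof. The definition of $\lambda(0w1, x)$ gives

\[ |\lambda'(0w1, x)| \le \frac{1 - \beta^{|0w1|}}{1 - \beta} \sum_{n=0}^{\infty} \beta^n \left| \frac{d}{dx} \phi_{((01w)^\omega)_{1:n}}(x) - \frac{d}{dx} \phi_{((10w)^\omega)_{1:n}}(x) \right| \le \frac{1}{1-\beta} \cdot \frac{1}{1-\beta} = \frac{1}{(1-\beta)^2} \]

where the second inequality follows as $0 < \beta^{|0w1|} < 1$ and $0 < \phi_u'(x) \le 1$ for any word $u$, since $0 < \phi_1'(x) < \phi_0'(x) \le 1$. □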


We use one more result about the maps $\phi_0, \phi_1$ of the main paper.

Lemma 8. Suppose $\phi_0(x)$ and $\phi_1(x)$ are as in the main paper and $x \in \mathbb{R}_+$. Then $\phi_{01}(x) < \phi_{10}(x)$.

Proof. The definitions of $\phi_0, \phi_1$ give

\[ \phi_{10}(x) - \phi_{01}(x) = \frac{(b - a)\left( (ab + b + a)x^2 + (2ab + 3b + 3a + 2)x + ab + 2b + 2a + 3 \right)}{\left( (ab + b + a)x + ab + b + 2a + 1 \right)\left( (ab + b + a)x + ab + 2b + a + 1 \right)} \]

which is positive as $b > a$ and $x \ge 0$. □
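The closed form in the proof of Lemma 8 can be verified in exact rational arithmetic. The sketch assumes $\phi_0(x) = (x+1)/(ax+a+1)$ and $\phi_1(x) = (x+1)/(bx+b+1)$, the Möbius maps induced by the matrices $F$ and $G$ of Section 11, and the convention that the first letter of a word is applied first, so that $\phi_{10} = \phi_0 \circ \phi_1$.

```python
from fractions import Fraction

def phi(r, x):
    """Assumed map (x + 1) / (r x + r + 1); r = a for letter 0, r = b for letter 1."""
    return (x + 1) / (r * x + r + 1)

def gap(a, b, x):
    """phi_10(x) - phi_01(x), with the first letter of each word applied first."""
    phi_10 = phi(a, phi(b, x))   # apply letter 1, then letter 0
    phi_01 = phi(b, phi(a, x))   # apply letter 0, then letter 1
    return phi_10 - phi_01

def closed_form(a, b, x):
    """The rational expression stated in the proof of Lemma 8."""
    d = a * b + b + a
    numerator = (b - a) * (d * x * x + (2 * a * b + 3 * b + 3 * a + 2) * x
                           + a * b + 2 * b + 2 * a + 3)
    denominator = (d * x + a * b + b + 2 * a + 1) * (d * x + a * b + 2 * b + a + 1)
    return numerator / denominator
```

Using `Fraction` inputs makes the comparison exact, so agreement on a grid of rational points is a genuine check of the algebra rather than a floating-point coincidence.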

Our proof of continuity will rely on the standard $(\epsilon, \delta)$ definition in which we will put $\delta = l_k$ where $l_k$ is defined in the following Lemma.

Lemma 9. For any $\epsilon > 0$ there is a $k < \infty$ such that $0 < l_k := \inf\{|\mathcal{X}_\pi| : \pi \in t_k\} < \epsilon$.

Proof. Say $0a1, 0b1$ are adjacent in $t_k$. Then by construction of $t_{k+1}$, the gap $(z_{10b}, z_{01a})$ contains
2® — 1 intervals corresponding to words of tk+i\tk- Each of these intervals is at most 
in length. Thus $\lim_{k \to \infty} l_k = 0$. This demonstrates the existence of a $k < \infty$ such that $l_k < \epsilon$.

To show that $l_k > 0$ for finite $k$, we shall demonstrate that assuming $l_k = 0$ leads to a contradiction. If $l_k = 0$ then there is some word $0w1 \in t_k$ such that $z_{10w} = z_{01w} =: x$. Therefore $\phi_{10w}(x) = \phi_{01w}(x)$. Now in $\mathbb{R}_+$, functions $\phi_0(x), \phi_1(x)$ have inverses, so $\phi_w^{-1}(x)$ is well-defined. Therefore

\[ \phi_{10}(x) = \phi_w^{-1} \circ \phi_{10w}(x) = \phi_w^{-1} \circ \phi_{01w}(x) = \phi_{01}(x) \]

which contradicts Lemma 8 as $x \ge 0$.

□ 


8.2 Proof of Continuity 

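Proof. We wish to show that for any $\epsilon > 0$, there exists a $\delta > 0$ such that for any $|x - y| < \delta$ we have $\Delta := |\lambda(x) - \lambda(y)| < \epsilon$. Without loss of generality we assume that $x > y$.

Specifically, we shall put $\delta = l_k > 0$ where $l_k$ is as defined in Lemma 9 and $k$ is any positive integer such that $\frac{l_k}{(1-\beta)^2} < \frac{\epsilon}{2}$ and such that $2\beta^{k+1} \frac{x+1}{(1-\beta)^2} < \frac{\epsilon}{2}$. The existence of such a $k$ is guaranteed by Lemma 9 and because $\beta \in (0, 1)$.

Let $0a1, 0b1$ be the $x$- and $y$-threshold words. If these words are the same then

\[ \Delta = |\lambda(0a1, x) - \lambda(0a1, y)| \le |y - x| \sup_{z \in [y, x]} |\lambda'(0a1, z)| \le \frac{|y - x|}{(1-\beta)^2} < \frac{l_k}{(1-\beta)^2} < \frac{\epsilon}{2} \]

where the second inequality follows from Lemma 7, the third from $|y - x| < \delta = l_k$ and the fourth from the definition of $k$.

Otherwise $0a1 \ne 0b1$. In this case, let $(0e1, 0b1)$ be the standard pair for word $0b1$, let $\hat{a} = \phi_{01a}(\hat{a})$ and $\hat{b} = \phi_{10b}(\hat{b})$, so that $\hat{a}$ and $\hat{b}$ satisfy the hypotheses of Lemmas 3 and 2 respectively. Noting that $y \le \hat{b} \le \hat{a} \le x$, our strategy is to write

\begin{align*}
\Delta &= |\Delta_1 + \Delta_2 + \Delta_3 + \Delta_4 + \Delta_5 + \Delta_6| \\
\Delta_1 &:= \lambda(0b1, y) - \lambda(0b1, \hat{b}) \\
\Delta_2 &:= \lambda(0b1, \hat{b}) - \lambda(0e1(0b1)^\omega, \hat{b}) \\
\Delta_3 &:= \lambda(0e1(0b1)^\omega, \hat{b}) - \lambda((0a1)^\omega, \hat{b}) \\
\Delta_4 &:= \lambda((0a1)^\omega, \hat{b}) - \lambda((0a1)^\omega, \hat{a}) \\
\Delta_5 &:= \lambda((0a1)^\omega, \hat{a}) - \lambda(0a1, \hat{a}) \\
\Delta_6 &:= \lambda(0a1, \hat{a}) - \lambda(0a1, x).
\end{align*}

Lemma 7 and the choice of $\delta$ give

\[ |\Delta_1| + |\Delta_4| + |\Delta_6| \le \frac{\hat{b} - y + \hat{a} - \hat{b} + x - \hat{a}}{(1-\beta)^2} \le \frac{l_k}{(1-\beta)^2} < \frac{\epsilon}{2} \tag{20} \]

while Lemmas 2 and 3 give

\[ \Delta_2 = \Delta_5 = 0. \tag{21} \]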

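It remains to consider $\Delta_3$. It follows from the definition of $l_k$ that for some adjacent words $0c1, 0d1$ in $t_k$ either $0a1 = 0c1$ or $0a1$ is a word strictly between $0c1$ and $0d1$ in the sense of Lemma 5, and that $0e1(0b1)^\omega$ is a word strictly between $0c1$ and $0d1$. Thus by Lemmas 4 and 5 we have $(0a1)^\omega = 0pu$ and $0e1(0b1)^\omega = 0pv$ where $p := d01c$ and $u, v$ are the appropriate suffixes. Therefore the definition of $\lambda(w, x)$ gives

\begin{align*}
|\Delta_3| &= |\lambda((0a1)^\omega, \hat{b}) - \lambda(0e1(0b1)^\omega, \hat{b})| \\
&= \frac{1}{1-\beta} \Big| S(01p, \hat{b}) + \beta^{|01p|} S(u, \phi_{01p}(\hat{b})) - S(10p, \hat{b}) - \beta^{|10p|} S(u, \phi_{10p}(\hat{b})) \\
&\qquad\qquad - S(01p, \hat{b}) - \beta^{|01p|} S(v, \phi_{01p}(\hat{b})) + S(10p, \hat{b}) + \beta^{|10p|} S(v, \phi_{10p}(\hat{b})) \Big| \\
&= \frac{\beta^{|01p|}}{1-\beta} \left| S(u, \phi_{01p}(\hat{b})) - S(u, \phi_{10p}(\hat{b})) - S(v, \phi_{01p}(\hat{b})) + S(v, \phi_{10p}(\hat{b})) \right| \\
&\le \frac{\beta^{|01p|}}{1-\beta} \left( |S(u, \phi_{01p}(\hat{b})) - S(u, \phi_{10p}(\hat{b}))| + |S(v, \phi_{01p}(\hat{b})) - S(v, \phi_{10p}(\hat{b}))| \right) \\
&\le \frac{\beta^{|01p|}}{1-\beta} \left( \frac{\hat{a} + 1}{1-\beta} + \frac{\hat{b} + 1}{1-\beta} \right) \\
&\le \frac{\beta^{k+1}}{(1-\beta)^2} \, 2(x + 1) \\
&< \frac{\epsilon}{2} \tag{22}
\end{align*}

where the last four inequalities follow from the triangle inequality, from Lemma 6, from equation (16) coupled with the fact that $\hat{b} \le \hat{a} \le x$ and finally from the definition of $k$.

Finally, coupling (20), (21) and (22) and using the triangle inequality gives

\[ \Delta < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]

This completes the proof. □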


9 Properties of the Linear-System Orbits M(w)

Recall the definitions about words from the main paper, particularly that $\overline{w}$ is the reverse of $w$. Also, recall the definitions of matrices $F, G, K, M(w)$. The first of the following propositions is used to prove the second. The second appears in the main paper.

Proposition 18. Suppose $w, w'$ are any words. Then

1. $\det(M(w)) = 1$,

2. $M(w) = K M(\overline{w})^{-1} K$,

3. $M(w) = \begin{pmatrix} e & f \\ \frac{eh - 1}{f} & h \end{pmatrix}$ for some $e, f, h \in \mathbb{R}$,

4. $M(w) - M(\overline{w}) = \lambda K$ for some $\lambda \in \mathbb{R}$,

5. $\dfrac{[M(w01w')]_{22}}{[M(w01w')]_{21}} \ge \dfrac{[M(w10w')]_{22}}{[M(w10w')]_{21}}$,

6. $[M(w)]_{22} \ge [M(w)]_{21}$.

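Proof. Claim 1. $\det(M(w)) = \prod_{i=1}^{|w|} \det(M(w_i)) = 1$ as $\det(F) = \det(G) = 1$.

Claim 2. The definitions of $F, G, K$ give $KF = F^{-1}K$, $KG = G^{-1}K$. Thus $KM(w) = M(w_{|w|})^{-1} \cdots M(w_1)^{-1} K = M(\overline{w})^{-1} K$. The result follows as $K^2 = I$.

Claim 3. Put $M(w) =: \begin{pmatrix} e & f \\ g & h \end{pmatrix}$ and solve $\det(M(w)) = 1 = eh - gf$ for $g$.

Claim 4. Substituting Claim 2 and Claim 3 in Claim 4 gives $M(w) - KM(w)^{-1}K = (h - e - g)K$.

Claim 5. Put $M := M(w)$, $N := M(w')$. We calculate

\begin{align*}
& [NGFM]_{22} [NFGM]_{21} - [NGFM]_{21} [NFGM]_{22} \\
&\quad = (b - a)(M_{11}M_{22} - M_{12}M_{21})\left( (ab + b + a)N_{22}^2 + (b + a + 2)N_{21}N_{22} + N_{21}^2 \right) \ge 0
\end{align*}

as $b > a > 0$, $\det(M) = 1$ and $N \ge 0$. The result follows as $NFGM \ge 0$ and $NGFM \ge 0$.

Claim 6. If $w = \epsilon$ then $[M(w)]_{22} - [M(w)]_{21} = 1 \ge 0$. Otherwise we use induction on $|w|$ to show that $M(w)v \ge 0$ where $v := (-1, 1)^T$. In the base case $w \in \{0, 1\}$ so

\[ M(w)v = \begin{pmatrix} 1 & 1 \\ c & 1 + c \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \ge 0 \quad \text{for some } c \in \{a, b\}. \]

For the inductive step, assume $w \in \{0u, 1u\}$ for some word $u$ satisfying $M(u)v \ge 0$. Then

\[ M(w)v = \begin{pmatrix} 1 & 1 \\ c & 1 + c \end{pmatrix} M(u)v \ge 0 \quad \text{for some } c \in \{a, b\}. \]

As $[M(w)v]_2 = [M(w)]_{22} - [M(w)]_{21}$, this completes the proof.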


□ 
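Claims 1, 4 and 6 of Proposition 18 lend themselves to an exact check on small words. The sketch below makes three assumptions that should be compared with the main paper: $F = \begin{pmatrix} 1 & 1 \\ a & 1+a \end{pmatrix}$ and $G = \begin{pmatrix} 1 & 1 \\ b & 1+b \end{pmatrix}$ as recalled in Section 11, $K = \begin{pmatrix} -1 & -1 \\ 0 & 1 \end{pmatrix}$ (the matrix forced by the identity $GF - FG = (b-a)K$ used later), and the product order $M(w) = M(w_{|w|}) \cdots M(w_1)$ read off from the proof of Claim 2.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

K = [[-1, -1], [0, 1]]   # assumed: the matrix satisfying GF - FG = (b - a) K

def M(word, a, b):
    """Assumed product order: the matrix of the last letter is leftmost."""
    F = [[1, 1], [a, 1 + a]]
    G = [[1, 1], [b, 1 + b]]
    out = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for c in word:
        out = matmul(G if c == "1" else F, out)
    return out

def check_claims(word, a, b):
    """Return the truth values of Claims 1, 4 and 6 of Proposition 18 for this word."""
    m, m_rev = M(word, a, b), M(word[::-1], a, b)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    diff = [[m[i][j] - m_rev[i][j] for j in range(2)] for i in range(2)]
    lam = diff[1][1]                                   # K_22 = 1, so this entry is lambda
    claim_1 = det == 1
    claim_4 = diff == [[lam * K[i][j] for j in range(2)] for i in range(2)]
    claim_6 = m[1][1] >= m[1][0]
    return claim_1, claim_4, claim_6
```

Claim 4 holds for any word regardless of which product order is used, since reversing the word swaps the two conventions; the order only matters when individual entries are compared across different words.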


Proposition 19. Suppose $w$ is a word, $p$ is a palindrome and $n \ge 0$. Then

1. $M(p) = \begin{pmatrix} \frac{fh + 1}{f + h} & f \\ \frac{h^2 - 1}{f + h} & h \end{pmatrix}$ for some $f, h \in \mathbb{R}$,

2. $\mathrm{tr}(M(10p)) = \mathrm{tr}(M(01p))$,

3. If $u \in \{p(10p)^n, (10p)^n 10\}$ then $M(u) - M(\overline{u}) = \lambda K$ for some $\lambda \in \mathbb{R}_-$,

4. If $w$ is a prefix of $p$ then $[M(p(10p)^n 10w)]_{22} \le [M(p(01p)^n 01w)]_{22}$,

5. $[M((10p)^n 10w)]_{21} \ge [M((01p)^n 01w)]_{21}$,

6. $[M((10p)^n 1)]_{21} \ge [M((01p)^n 0)]_{21}$.

Proof. In this proof, we refer to Claim $k$ of Proposition 18 as P$k$.

Claim 1. P2 gives $M(p) = KM(p)^{-1}K$ as $p = \overline{p}$. But in the notation of P3, $[M(p)]_{11} = [KM(p)^{-1}K]_{11}$ says $e = h - (eh - 1)/f$. Solve this for $e$ and substitute in P3.

Claim 2. Noting that $GF - FG = (b - a)K$, the notation of Claim 1 gives

\[ \mathrm{tr}(M(01p)) - \mathrm{tr}(M(10p)) = \mathrm{tr}(M(p)(GF - FG)) = (b - a)\,\mathrm{tr}(M(p)K) = 0. \]

Claim 3. Note we can move from $u$ to $\overline{u}$ just by swapping some 10 for 01. So, repeated application of P5 gives the inequality $\frac{[M(u)]_{22}}{[M(u)]_{21}} \le \frac{[M(\overline{u})]_{22}}{[M(\overline{u})]_{21}}$. But the denominators of this inequality are equal (and non-negative) as P4 gives $[M(u)]_{21} - [M(\overline{u})]_{21} = \lambda' K_{21} = 0$ for some $\lambda' \in \mathbb{R}$. Thus this inequality reduces to $[M(u)]_{22} \le [M(\overline{u})]_{22}$. Yet P4 also gives $[M(u) - M(\overline{u})]_{22} = \lambda K_{22}$ which combined with the previous sentence says that $\lambda K_{22} \le 0$. As $K_{22} = 1$, this gives $\lambda \in \mathbb{R}_-$.

Claim 4. Let $s$ be the corresponding suffix so $p = ws$ and

\[ M(p(10p)^n 10w) - M(p(01p)^n 01w) = M(s)^{-1}\left( M(p(10p)^{n+1}) - M(p(01p)^{n+1}) \right) =: \Delta. \]

But Claim 3 with $u = p(10p)^{n+1}$ gives

\[ [\Delta]_{22} = \lambda [M(s)^{-1}K]_{22} = \lambda [KM(\overline{s})]_{22} = \lambda\left( [M(\overline{s})]_{22} - [M(\overline{s})]_{21} \right) \le 0 \]

for some $\lambda \le 0$ by P2.

Claim 5. As $M(w) \ge 0$, Claim 3 with $u = (10p)^n 10$ gives

\[ [M(w)(M((10p)^n 10) - M((01p)^n 01))]_{21} = \lambda [M(w)K]_{21} = \lambda [-M(w)]_{21} \ge 0. \]

Claim 6. Let $E := \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}$. Then $G - F = (b - a)E \ge 0$, so that

\begin{align*}
[GM((10p)^n) - FM((01p)^n)]_{21} &= [(b - a)EM((10p)^n) + FM((10p)^n) - FM((01p)^n)]_{21} \\
&\ge [M((10p)^n 0) - M((01p)^n 0)]_{21} \ge 0
\end{align*}

by Claim 5. This completes the proof. □


10 Majorisation 

In the main paper, we used one result about majorisation which was similar-but-not-identical to any results in Marshall, Olkin and Arnold (2011). Let us prove that result.

Proposition 20. Suppose $x, y \in \mathbb{R}^m_+$ and $f : \mathbb{R} \to \mathbb{R}$ is a symmetric function that is convex and decreasing on $\mathbb{R}_+$. Then $x \prec^w y$ and $\beta \in [0, 1]$ imply $\sum_{i=1}^m \beta^i f(x_{(i)}) \le \sum_{i=1}^m \beta^i f(y_{(i)})$.

Proof. As the claim relates to $x_{(i)}$ and $y_{(i)}$ we assume that $x_i$ and $y_i$ are in ascending order.

Marshall et al (3H2b, page 133) says that if $g : A \to \mathbb{R}$ is a non-decreasing and convex function on $A \subseteq \mathbb{R}$ and $(u_1, \ldots, u_m)$ is a non-increasing and non-negative sequence, then for all non-increasing sequences $(p_1, \ldots, p_m)$ the function $\phi(p) := \sum_{i=1}^m u_i g(p_i)$ is Schur-convex.

Indeed the function $f$ is increasing and convex for $p \in \mathbb{R}_-$ (such as $p = -x$ and $p = -y$) and $(\beta, \ldots, \beta^m)$ is a non-increasing and non-negative sequence for $\beta \in [0, 1]$. Thus for all non-increasing sequences $(p_1, \ldots, p_m)$ on $\mathbb{R}^m_-$ the function $\psi(p) := \sum_{i=1}^m \beta^i f(p_i)$ is Schur-convex.

Recall (ibid, page 12) that $a \in \mathbb{R}^m$ is said to be weakly submajorised by $b \in \mathbb{R}^m$, written $a \prec_w b$, if

\[ \sum_{i=1}^k a_{[i]} \le \sum_{i=1}^k b_{[i]}, \quad k = 1, \ldots, m, \quad \text{where } a_{[i]} \text{ denotes } a \text{ in descending order} \]

and that $a \prec^w b \iff -a \prec_w -b$ (ibid, page 13).

However (ibid, 3A8, page 87) if $\phi(p)$ is a real function on $A \subseteq \mathbb{R}^m$ which is non-decreasing in each argument $p_i$ and Schur-convex on $A$ and $p \prec_w q$ on $A$ then $\phi(p) \le \phi(q)$.

Indeed, the function $\psi(p) = \sum_{i=1}^m \beta^i f(p_i)$ is a real function on $\mathbb{R}^m_-$ which is non-decreasing in each argument and Schur-convex on $\mathbb{R}^m_-$ for all non-increasing sequences $(p_1, \ldots, p_m)$. Furthermore, $-y \succ_w -x$ as $x \prec^w y$. Therefore $\psi(y) = \psi(-y) \ge \psi(-x) = \psi(x)$ as claimed. □


11 Clarification of Theorem 1 for $0 < x \le y_1$ or $y_0 \le x < \infty$

Recall the following definitions and assumption from the main paper

\[ F := \begin{pmatrix} 1 & 1 \\ a & 1 + a \end{pmatrix}, \quad G := \begin{pmatrix} 1 & 1 \\ b & 1 + b \end{pmatrix}, \quad E := \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}, \quad v(x) := \begin{pmatrix} x \\ 1 \end{pmatrix}, \quad b > a > 0. \]

If $0 < x \le y_1$ or $y_0 \le x < \infty$ then the relevant linear systems, (9) in the main paper, are

\begin{align*}
(M(1^{k+1}) - M(01^k))v(x) &= (G - F)G^k v(x) = (b - a)EG^k v(x) \ge 0 \\
(M(10^k) - M(0^{k+1}))v(x) &= (G - F)F^k v(x) = (b - a)EF^k v(x) \ge 0
\end{align*}

for $k \in \mathbb{Z}_+$, where both inequalities follow as $E, F, G$ are all $\ge 0$, as $b > a$ and as $x \ge \min\{y_0, y_1\} > 0$. Therefore all cumulative sums of the above expressions are non-negative, so the derivative of the numerator of the Whittle index is non-negative by the same weak-supermajorisation argument as in the main paper.
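The identity $G - F = (b - a)E$ and the resulting non-negativity of both linear systems can be confirmed in exact arithmetic. This is a small sketch assuming $v(x) = (x, 1)^T$, a form inferred from the identities $Fv(-1) = Gv(-1) = v(0)$ and $Ev(-1) = (0, 0)^T$ quoted at the end of this section; the function name `gain` is illustrative.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matvec(A, u):
    """Product of a 2x2 matrix with a length-2 vector."""
    return [A[i][0] * u[0] + A[i][1] * u[1] for i in range(2)]

def matpow(A, k):
    """k-th power of a 2x2 matrix, with A^0 the identity."""
    out = [[1, 0], [0, 1]]
    for _ in range(k):
        out = matmul(out, A)
    return out

def gain(a, b, x, k, base):
    """Return ((G - F) B^k v(x), (b - a) E B^k v(x)) for B = F (base 'F') or G (base 'G')."""
    F = [[1, 1], [a, 1 + a]]
    G = [[1, 1], [b, 1 + b]]
    E = [[0, 0], [1, 1]]
    v = [x, 1]                                   # assumed form v(x) = (x, 1)^T
    u = matvec(matpow(F if base == "F" else G, k), v)
    diff = [[G[i][j] - F[i][j] for j in range(2)] for i in range(2)]
    left = matvec(diff, u)
    right = [(b - a) * c for c in matvec(E, u)]
    return left, right
```

Both returned vectors have a zero first component, so the inequality is really a statement about the second component, which is $(b - a)$ times the sum of the entries of $B^k v(x)$.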

Meanwhile, the denominator of the index in these cases is

\[ \sum_{k=0}^{\infty} \beta^{k+1}\left( (1^\omega)_{k+1} - (01^\omega)_{k+1} \right) = \beta = \sum_{k=0}^{\infty} \beta^{k+1}\left( (10^\omega)_{k+1} - (0^\omega)_{k+1} \right) \]

which is non-negative. Therefore the rest of the proof of Theorem 1 follows as in the main paper.

In fact we could say that the majorisation point, which is $\phi_w(0)$ for words $0w1$ in the main paper, is $-1$ in both cases. Indeed, Claim 6 of Proposition 4 of the main paper says that $Fv(-1) = Gv(-1) = v(0)$. Also, $Ev(-1) = (0, 0)^T$. Thus for all $k \in \mathbb{Z}_+$, $EG^k v(-1) \ge EF^k v(-1) \ge 0$ whereas $Ev(-1 - \epsilon) < 0$ for any $\epsilon > 0$.